Knowledge-Free Correlated Agreement for Incentivizing Federated Learning

Kentaroh Toyoda; Leon Witt; Lucy Klinger; Togrul Abbasli; Wojciech Samek

arXiv: 2605.04747 · v1 · submitted 2026-05-06 · cs.LG · cs.AI · cs.GT

Pith reviewed 2026-05-08 18:06 UTC · model grok-4.3

Keywords: federated learning, incentive mechanism, truthful reporting, correlated agreement, knowledge-free rewards, decentralized incentives, label flipping

The pith

KFCA rewards federated learning clients for truthful reporting without requiring ground truth or knowledge of the data distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Knowledge-Free Correlated Agreement (KFCA) to assign rewards to participants in federated learning based solely on their submitted reports. This removes the need for a trusted test set, ground truth labels, or prior knowledge of the underlying data distribution. KFCA establishes strict truthfulness when reports consist of discrete categorical labels and an honest majority of clients exists, eliminating the label-flipping attacks that undermine earlier Correlated Agreement methods. The mechanism supports efficient real-time computation and is tested on federated tuning of large language model adapters as well as a practical printed circuit board inspection task, making it suitable for decentralized and blockchain-based incentive systems.

Core claim

KFCA computes client rewards by measuring agreement among categorical reports in a manner that requires no external validation data. The authors prove that, given an honest majority of clients, honest reporting becomes the strictly dominant strategy, because any deviation such as systematic label flipping reduces a client's expected reward relative to truthful peers.

What carries the argument

Knowledge-Free Correlated Agreement (KFCA), a reward function that derives payments directly from pairwise agreement statistics on discrete categorical reports without external references.
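To make the idea of paying from agreement statistics concrete, the sketch below scores one pair of clients with a bonus-minus-penalty rule: agreement on the same task earns credit, and the agreement rate across different tasks is subtracted as a baseline. It assumes that only exact label matches count as agreement and that the baseline averages over all ordered pairs of distinct tasks. The name `agreement_score` is hypothetical, and this is an illustration of the general technique, not the paper's exact scoring rule.

```python
from collections import Counter


def agreement_score(reports_i, reports_j):
    """Bonus-minus-penalty agreement score for one pair of clients.

    reports_i[k] and reports_j[k] are the two clients' categorical
    reports on task k. Hypothetical helper, not the paper's API.
    """
    m = len(reports_i)
    if m != len(reports_j) or m < 2:
        raise ValueError("need two equally long report lists with >= 2 tasks")
    # Bonus term: number of tasks on which both clients report the same label.
    same_task = sum(a == b for a, b in zip(reports_i, reports_j))
    # Penalty term: agreements over ordered pairs of *different* tasks,
    # derived from label counts instead of an O(m^2) double loop.
    counts_i, counts_j = Counter(reports_i), Counter(reports_j)
    all_pairs = sum(counts_i[a] * counts_j[a] for a in counts_i)
    cross_task = all_pairs - same_task
    return same_task / m - cross_task / (m * (m - 1))


peer = [0, 1, 0, 1, 0, 1, 0, 1]
honest = list(peer)                 # agrees with the peer on every task
flipped = [1 - z for z in peer]     # systematic label flip
constant = [0] * len(peer)          # uninformed constant report

honest_score = agreement_score(honest, peer)
flipped_score = agreement_score(flipped, peer)
constant_score = agreement_score(constant, peer)
```

On this toy data the honest report scores above zero, the flipped report below zero, and the constant report exactly zero, which is the qualitative ordering the truthfulness claim describes.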

If this is right

Honest clients receive strictly higher expected rewards than those who flip labels.
Reward calculation runs in linear time relative to the number of clients, enabling on-chain execution.
The same mechanism applies across different model types including LLM adapters and industrial vision tasks.
No public test set or distribution statistics must be collected or shared.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could reduce central coordinator power in federated systems by moving reward logic fully on-chain.
Extending KFCA to continuous or probabilistic reports would require new correlation measures but might preserve truthfulness under similar majority assumptions.
Participation rates in federated learning could rise if clients expect provably fair compensation without revealing raw data.

Load-bearing premise

An honest majority of clients exists and all reports use discrete categorical labels.

What would settle it

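Run a simulation or real federated training round in which a minority of clients coordinate to flip all their labels while an honest majority remains; if the colluding group receives rewards at least as high as the honest clients, the strict truthfulness claim fails. Repeating the run with more than half the clients flipping probes the honest-majority premise rather than the claim itself.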
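A toy version of such a test can be sketched as follows. Everything here is an assumption made for illustration: binary labels in {-1, +1}, a per-client signal accuracy of 0.9, 200 tasks, rewards averaged over all peers, and a simple same-task-minus-cross-task agreement score standing in for the paper's mechanism. The names `agreement_score` and `simulate` are hypothetical.

```python
import random
from collections import Counter


def agreement_score(r_i, r_j):
    """Same-task agreement rate minus cross-task agreement rate (sketch)."""
    m = len(r_i)
    same = sum(a == b for a, b in zip(r_i, r_j))
    c_i, c_j = Counter(r_i), Counter(r_j)
    cross = sum(c_i[a] * c_j[a] for a in c_i) - same
    return same / m - cross / (m * (m - 1))


def simulate(n_clients, n_flippers, n_tasks=200, accuracy=0.9, seed=0):
    """Return (flipper_rewards, honest_rewards) for one simulated round."""
    rng = random.Random(seed)
    truth = [rng.choice([-1, 1]) for _ in range(n_tasks)]
    reports = []
    for c in range(n_clients):
        # Each client observes the latent truth through a noisy channel.
        signal = [y if rng.random() < accuracy else -y for y in truth]
        if c < n_flippers:
            signal = [-z for z in signal]  # coordinated label flip
        reports.append(signal)
    rewards = []
    for i in range(n_clients):
        # Reward = mean pairwise score against every other client.
        scores = [agreement_score(reports[i], reports[j])
                  for j in range(n_clients) if j != i]
        rewards.append(sum(scores) / len(scores))
    return rewards[:n_flippers], rewards[n_flippers:]


flip_minor, honest_major = simulate(n_clients=10, n_flippers=3)
flip_major, honest_minor = simulate(n_clients=10, n_flippers=7)
```

In this toy model every honest client out-earns every flipper when flippers are the minority, and the ordering reverses once flippers form the majority, which is why the honest-majority premise carries so much weight.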

Figures

Figures reproduced from arXiv: 2605.04747 by Kentaroh Toyoda, Leon Witt, Lucy Klinger, Togrul Abbasli, Wojciech Samek.

**Figure 1.** Illustration of the multi-task peer-prediction (MTPP) mechanism, mapping each MTPP concept to its FL counterpart (KFCA-D / KFCA-QP):
Task k: classify a test-set sample / decide the update sign at parameter p of a model update Δθ
Latent truth Y^k: correct class label / optimal update direction at p
Signal Z_i^k: local classification of an image / sign(Δθ_i[p]) ∈ {−1, +1}
Effort e_i^k = 1: train local model / fine-tune LLM adapter (LoRA/DoRA)
No effort e_i^k = 0: …

**Figure 2.** KFCA-QP vs. CA-QP under a sign-flip attack: (a) global test accuracy over FL rounds, (b) …

**Figure 3.** KFCA on a public test set (KFCA-D) and on quantized parameter updates (KFCA-QP).

**Figure 4.** (a) Variance of reward distributions for KFCA, CA, and exact SV (top: public-dataset …

**Figure 5.** Real-world PCB inspection deployment with KFCA-QP. (a) Hardware: PCB board with …

**Figure 6.** Federated LLM adapter tuning with KFCA-QP. (a) Per-round pipeline: server distributes …

**Figure 7.** Computation time vs. #Clients and #Peers. KFCA-QP on a simple CNN with 5,280 parameters and KFCA-D on a 10,000-unit dataset, measured on a single Apple Silicon M2 Pro (16 GB RAM).

**Figure 8.** Reward distribution for individual clients.

**Figure 9.** Empirical Δ̂ matrices for quantized adapter updates across domains (rows) and rounds (columns). Each cell is a 2 × 2 matrix over {−1, +1}. All matrices satisfy the categorical-world condition: Δ(−1, −1), Δ(+1, +1) > 0 and Δ(−1, +1), Δ(+1, −1) < 0.

**Figure 10.** KFCA-QP rewards under various attack strategies across four federated LLM fine-tuning …

**Figure 11.** Decentralized and Incentivized Federated Learning: Application of KFCA-D on …
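The Figure 1 caption treats each parameter coordinate p as a task whose report is sign(Δθ_i[p]) ∈ {−1, +1}. A minimal sketch of that quantization step follows, with made-up update vectors. Mapping an exact zero to +1 is an assumption of this sketch, and `sign_quantize` and `agreement_rate` are hypothetical names.

```python
def sign_quantize(update):
    """Map each coordinate of a model update to a report in {-1, +1}.

    Zero is mapped to +1 here, which is an assumption of this sketch.
    """
    return [1 if x >= 0 else -1 for x in update]


def agreement_rate(report_a, report_b):
    """Fraction of coordinates on which two sign reports coincide."""
    if len(report_a) != len(report_b) or not report_a:
        raise ValueError("reports must be non-empty and equally long")
    return sum(a == b for a, b in zip(report_a, report_b)) / len(report_a)


# Illustrative update vectors (not taken from the paper).
honest_update = [0.12, -0.40, 0.03, -0.07, 0.25, -0.31]
peer_update = [0.10, -0.35, -0.01, -0.09, 0.30, -0.28]
flipped_update = [-x for x in honest_update]   # sign-flip attack
zero_update = [0.0] * len(honest_update)       # all-zero free-rider update

peer_report = sign_quantize(peer_update)
honest_rate = agreement_rate(sign_quantize(honest_update), peer_report)
flipped_rate = agreement_rate(sign_quantize(flipped_update), peer_report)
zero_report = sign_quantize(zero_update)
```

With these vectors the honest client matches its peer on five of six coordinates, the sign-flipped client on one of six, and the all-zero update collapses to a constant report that carries no information about the peer.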

Original abstract

We introduce Knowledge-Free Correlated Agreement (KFCA) to reward client contributions in federated learning (FL) without relying on ground truth, a public test set, or distribution knowledge. Under categorical reports and an honest majority, KFCA is strictly truthful, addressing the label-flipping vulnerability of Correlated Agreement (CA). We evaluate KFCA on federated LLM adapter tuning and a real-world PCB inspection task, showing efficient real-time reward computation suitable for decentralized and blockchain-based incentive designs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

KFCA provides a knowledge-free fix for label-flipping in FL incentives but depends on an honest majority assumption.

The main point to take away is that KFCA is designed to provide truthful incentives in federated learning without any ground truth data or knowledge of the underlying distribution, and it specifically counters the label-flipping attack that affected the original Correlated Agreement approach.

What the paper does well is to extend the correlated agreement idea into a knowledge-free version that can be used in decentralized settings like blockchain. They demonstrate efficient computation that works in real time, which is important for practical incentive systems. The experiments on federated LLM adapter tuning and a PCB inspection task give concrete examples of how it might apply to real problems, moving beyond pure theory.

The soft spots are around the assumptions and verification. The strict truthfulness holds only with an honest majority and categorical reports, which is a strong condition that may not always be met in open networks. Without more details on the proof or direct tests against label flipping in the evaluations, it's difficult to assess how solid the guarantees are in practice. The paper could benefit from more analysis on what happens when the majority assumption is violated.

This kind of work is useful for people designing incentive mechanisms for federated learning, particularly those interested in decentralized or blockchain-integrated systems. A reader looking for new ways to handle contribution rewards without trusted parties would find the construction and applications relevant.

I would recommend putting this through peer review. The idea targets a clear issue in the area and includes practical evaluations, so it merits referee attention even if revisions are needed to strengthen the theoretical parts.

Referee Report

2 major / 1 minor

Summary. The paper introduces Knowledge-Free Correlated Agreement (KFCA) as an incentive mechanism for federated learning that rewards client contributions without ground truth, a public test set, or distributional knowledge. It asserts that under categorical reports and an honest majority of clients, KFCA is strictly truthful and resolves the label-flipping vulnerability present in the prior Correlated Agreement (CA) method. The work includes empirical evaluations on federated LLM adapter tuning and a real-world PCB inspection task, emphasizing computational efficiency for decentralized and blockchain-based deployments.

Significance. If the strict truthfulness guarantee holds under the stated assumptions, KFCA would provide a practical, knowledge-free approach to truthful incentive design in federated learning, enabling more reliable decentralized ML systems where verification resources are limited. The evaluations on LLM tuning and industrial inspection tasks indicate real-world applicability and efficiency, strengthening the case for adoption in privacy-sensitive or distributed settings.

major comments (2)

[Abstract] The central claim that 'KFCA is strictly truthful' under categorical reports and honest majority is load-bearing for the contribution but is asserted without a proof sketch, theorem statement, or reference to a formal argument in the main text; this must be supplied to substantiate the result and distinguish it from CA.

[Evaluation] The manuscript reports results on LLM adapter tuning and PCB inspection but does not detail the specific metrics, attack simulations, or comparisons used to verify robustness against label-flipping or to confirm empirical suitability; without these, the claim of addressing CA vulnerabilities remains incompletely supported.

minor comments (1)

[Abstract] The abstract introduces KFCA and CA without a brief parenthetical expansion on first use of 'Correlated Agreement'; adding this would improve immediate readability for readers unfamiliar with the prior work.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback. The comments highlight important areas for strengthening the presentation of our results on KFCA. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and details.

Referee: [Abstract] The central claim that 'KFCA is strictly truthful' under categorical reports and honest majority is load-bearing for the contribution but is asserted without a proof sketch, theorem statement, or reference to a formal argument in the main text; this must be supplied to substantiate the result and distinguish it from CA.

Authors: We agree that the strict truthfulness guarantee is central to the contribution and requires clearer substantiation. The full manuscript contains Theorem 1 in Section 3, which formally states and proves strict truthfulness of KFCA under categorical reports and honest majority (with the complete proof in the appendix). To address the concern, we will add a concise proof sketch directly in the abstract and ensure an explicit forward reference to Theorem 1 appears in the introduction. This revision will also more sharply distinguish KFCA from prior CA by emphasizing the knowledge-free property.
revision: yes

Referee: [Evaluation] The manuscript reports results on LLM adapter tuning and PCB inspection but does not detail the specific metrics, attack simulations, or comparisons used to verify robustness against label-flipping or to confirm empirical suitability; without these, the claim of addressing CA vulnerabilities remains incompletely supported.

Authors: We acknowledge that the evaluation sections would benefit from greater specificity to fully support the robustness claims. In the revised manuscript we will expand these sections to explicitly list the metrics (e.g., client reward correlation with true contribution and final model accuracy), describe the label-flipping attack model and simulation protocol in detail, and include side-by-side comparisons of KFCA versus the original CA method under the same attacks. These additions will provide direct empirical evidence that KFCA mitigates the label-flipping vulnerability while preserving efficiency.
revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper introduces KFCA as a mechanism for rewarding contributions in federated learning without ground truth, with the strict truthfulness claim explicitly conditioned on categorical reports and an honest majority of clients. This is presented as following from the mechanism design under those assumptions to address label-flipping in prior CA, without any reduction of predictions to fitted inputs, self-definitional loops, or load-bearing self-citations that would make the result equivalent to its inputs by construction. The evaluation on LLM tuning and PCB inspection is empirical and separate from the theoretical claim. No quoted equations or steps in the abstract or summary exhibit the forbidden patterns of circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

From the abstract alone, the central claim rests on the honest-majority and categorical-reports assumptions; no free parameters, invented entities, or additional axioms are identifiable without the full text.

axioms (2)

Domain assumption: honest majority of clients.
Required for the strict truthfulness guarantee stated in the abstract.

Domain assumption: reports are categorical.
Explicitly required for the KFCA truthfulness property per the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Cost.FunctionalEquation / Foundation.BranchSelection, washburn_uniqueness_aczel (tag: unclear)
Note: no overlap; RS's Δ-like structure is the cosh/J cost, not a probability covariance matrix.
Relation between the paper passage and the cited Recognition theorem: unclear.

Δ(a,b) := P(Z1=a, Z2=b) − P(Z1=a)P(Z2=b) ... Δ(a,a)>0, Δ(a,b)<0 for a≠b (categorical-world condition).
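The quoted definition can be checked on data directly. The sketch below estimates Δ from two clients' paired reports using empirical frequencies and then tests the categorical-world sign condition (positive diagonal, negative off-diagonal). The helper names `delta_matrix` and `is_categorical` and the sample reports are illustrative assumptions, not taken from the paper.

```python
def delta_matrix(z1, z2, labels):
    """Empirical Delta(a, b) = P(Z1=a, Z2=b) - P(Z1=a) * P(Z2=b)."""
    if len(z1) != len(z2) or not z1:
        raise ValueError("reports must be non-empty and equally long")
    m, size = len(z1), len(labels)
    index = {a: k for k, a in enumerate(labels)}
    # Joint frequency table of the paired reports.
    joint = [[0.0] * size for _ in range(size)]
    for a, b in zip(z1, z2):
        joint[index[a]][index[b]] += 1.0
    joint = [[cell / m for cell in row] for row in joint]
    # Marginals are the row and column sums of the joint table.
    p1 = [sum(row) for row in joint]
    p2 = [sum(joint[r][c] for r in range(size)) for c in range(size)]
    return [[joint[r][c] - p1[r] * p2[c] for c in range(size)]
            for r in range(size)]


def is_categorical(delta):
    """True if Delta(a, a) > 0 and Delta(a, b) < 0 for all a != b."""
    size = len(delta)
    return all((delta[r][c] > 0) if r == c else (delta[r][c] < 0)
               for r in range(size) for c in range(size))


# Illustrative paired sign reports over eight tasks.
z1 = [-1, -1, -1, +1, +1, +1, -1, +1]
z2 = [-1, -1, +1, +1, +1, -1, -1, +1]

delta = delta_matrix(z1, z2, labels=[-1, +1])
aligned_ok = is_categorical(delta)
flipped_ok = is_categorical(delta_matrix(z1, [-b for b in z2], labels=[-1, +1]))
```

For these reports the diagonal entries are 0.125 and the off-diagonal entries are −0.125, so the condition holds; flipping one client's reports swaps the signs and the check fails. Each row of Δ sums to zero, as expected from subtracting the product of marginals.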

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages

[1]

high-value contribution

or cost (data collection, training, etc) Pandey et al. [2020], Kang et al. [2019], Ding et al. [2020], Tang and Wong [2021]. Yet reward systems based on such simplified assumptions may not be applicable in any real-world scenario as the dominant strategy for an individual-rational agent is dishonest behavior (report the best possible outcome without costl...

work page 2020
[2]

Centralization requirement.Estimating ∆ requires access to all client reports, violating the decentralized and privacy-preserving nature of FL

work page
[3]

Computational inefficiency.Per pair, the server scans m joint reports to populate an L×L count table and derive ˆ∆ (O(m+L 2)); across n 2 pairs, O(n2(m+L 2)) per round, dominated byO(n 2m)in FL wherem≫L 2

work page
[4]

17 A.2.2 Worked label-flipping example Substituting the CA score matrix from Definition 2.4 into Eq

Delayed reward computation.Because ∆ must be globally estimated, CA cannot compute payments in real time, limiting deployability in online or blockchain-based implementations. 17 A.2.2 Worked label-flipping example Substituting the CA score matrix from Definition 2.4 into Eq. (1), the expected reward under deterministic reporting functionsf 1, f2 : [L]→[L...

work page 2013
[5]

Payment feasibility.The mechanism can assign (e.g., monetary) rewards to clients based on their reports

work page
[6]

The utility for clientiis: Ui =E i −c(e i),wherec(1)> c(0) = 0, withE i the expected total payment across tasks, andc(e i)the total effort cost

Risk-neutrality.Clients are risk-neutral and only care about their own payment and effort cost. The utility for clientiis: Ui =E i −c(e i),wherec(1)> c(0) = 0, withE i the expected total payment across tasks, andc(e i)the total effort cost

work page
[7]

Binary effort.Each client i∈N independently decides whether to exert effort on each task k∈M, denoted: ek i ∈ {0,1},where1 =effortful,0 =shirking

work page
[8]

Clients cannot distinguish tasks before seeing their signals

Ex-ante identical tasks.All tasks are drawn independently from a common prior dis- tribution over ground-truth labels. Clients cannot distinguish tasks before seeing their signals. • Signal generation.Each task k∈M has an unknown ground-truth label Y k ∈[L] . Each clientireceives a private signalZ k i ∈[L], sampled based on their effort level: P(Z k i =a|...

work page 2016
[9]

Tasks are ex-ante identical and randomly assigned; clients cannot distinguish between bonus and penalty tasks

work page
[10]

Clients receive conditionally independent signals Z1, Z2 ∈[L] , given the latent label Y∈[L]

work page
[11]

Define thedelta matrix∆∈R L×L as: ∆(a, b) :=P(Z 1 =a, Z 2 =b)−P(Z 1 =a)P(Z 2 =b), which is symmetric and mean-centered: ∆(a, b) = ∆(b, a), X b ∆(a, b) = X a ∆(a, b) = 0

Clients apply deterministic reporting functionsf 1, f2 : [L]→[L]. Define thedelta matrix∆∈R L×L as: ∆(a, b) :=P(Z 1 =a, Z 2 =b)−P(Z 1 =a)P(Z 2 =b), which is symmetric and mean-centered: ∆(a, b) = ∆(b, a), X b ∆(a, b) = X a ∆(a, b) = 0. The CA mechanism defines the scoring rule: SCA(a, b) := 1,if∆(a, b)>0, 0,otherwise. Proof.Letf 1, f2 : [L]→[L]be arbitrar...

work page
[12]

up to label permutation

Uninformed strategies get zero:Suppose f1 is uninformed, e.g., f1(a) =r for all a∈[L] . Then: E(f1, f2) = X a,b ∆(a, b)· S CA(r, f2(b)) = X b SCA(r, f2(b))· X a ∆(a, b) = 0, becauseP a ∆(a, b) = 0for allb. Therefore the truthful strategy achieves maximal expected reward. Equality holds only for strategy profiles that preserve all positively correlated sig...

work page 2016
[13]

Signals are conditionally independent given the latent truth

work page
[14]

This sign structure ensures that agreement reflects shared latent truth and enables KFCA to remain incentive-compatible and knowledge-free even in complex settings

Each client’s signal is informative about the latent truth, we canenforcea categorical structure through a transformation pipeline that maps arbitrary outputs into categorical representations, producing a correlation matrix e∆(a, b) =P( eZ1 =a, eZ2 =b)−P( eZ1 =a)P( eZ2 =b), which satisfies the categorical sign condition: sign(e∆(a, b)) = >0,ifa=b, <0,ifa̸...

work page
[15]

Clients decode this truth via maximum a posteriori (MAP) inference: eZ k i = arg max a∈[L′] P(Y k =a|Z k i )∝π(a)P i(Z k i |Y k =a), 23 where π(a) is a prior over latent states

Latent-truth alignment.Introduce a latent categorical variable Y k ∈[L ′] representing the true state for each task. Clients decode this truth via maximum a posteriori (MAP) inference: eZ k i = arg max a∈[L′] P(Y k =a|Z k i )∝π(a)P i(Z k i |Y k =a), 23 where π(a) is a prior over latent states. This step ensures cross-client consistency: reports with the s...

work page
[16]

better than random

Categorical regularization.When signals are noisy or sparsely distributed, we apply a smoothing step to the estimated correlation matrix: e∆(a, b) := sign(E[∆(a, b)])· |E[∆(a, b)]| γ , γ∈(0,1). This transformation preserves sign structure while enhancing diagonal dominance, ensuring thatsign(e∆) =I. Regularization note.This smoothing step is a modeling de...

work page
[17]

Gradient magnitudes may differ, but signs tend to agree on most coordinates—the optimum lives in the same direction for everyone

Shared optimization surface.Clients descend thesameloss from thesameglobal check- point. Gradient magnitudes may differ, but signs tend to agree on most coordinates—the optimum lives in the same direction for everyone

work page
[18]

Conditional on the global checkpoint and the optimum Y , signals are independent

Conditional independence by construction.Within a round, clients run local SGD without inter-client communication. Conditional on the global checkpoint and the optimum Y , signals are independent. A.9.3 Mapping from non-IID scenarios toα i Non-IID scenario Mechanism Effect onα i Label skew (Dirichlet αdir) Imbalanced class gradients on shared layers Highe...

work page
[19]

This ensures that no value is left unallocated

Efficiency/Pareto Optimality:The total payout to all players in the coalition N should be equal to the total value generated by the coalition, i.e., X i∈N ϕi(v) =v(N). This ensures that no value is left unallocated

work page
[20]

This ensures equal pay for equal contribution

Symmetry:If two players i and j contribute equally to all coalitions they are part of, they should receive the same payoff, i.e., if v(S∪ {i}) =v(S∪ {j}) for all S⊆N\ {i, j} then ϕi(v) =ϕ j(v). This ensures equal pay for equal contribution. 1Let there be four clients 1, 2, 3, and 4. Given the permutationΠ = 4213, thenS Π 3 ={4,2,1}. 26 Table 4: Accuracyv(...

work page
[21]

This axiom ensures that players that do not contribute to any coalition receive nothing

Null Player:If a player i does not contribute any value to any coalition it is part of, the player should receive a payoff of 0, i.e., if v(S∪ {i}) =v(S) for all S⊆N\ {i} , then ϕi(v) = 0. This axiom ensures that players that do not contribute to any coalition receive nothing

work page
[22]

This axiom ensures the Shapley value is consistent when games are combined

Additivity:If u and v are characteristic functions, then the payoff for a player under the sum of these games is equal to the sum of the player’s payoffs under each of these games, i.e., ϕ(u+v) =ϕ(u) +ϕ(v). This axiom ensures the Shapley value is consistent when games are combined. A.11.2 Shapley Value in Federated Learning Let A be a learning algorithm a...

work page 2020
[23]

However, the evaluating entity has to have access to gradients of all participating clientsi∈N

Avoiding retraining: Shapley value computation can be expedited by reconstructing submod- els θS from gradients instead of retraining the model on DS for every subset S (equation 8), according to v(S) =V (θglobal + X i∈S Di |DS|∆θi), Dpub ! ,(9) omitting computationally expensive training. However, the evaluating entity has to have access to gradients of ...

work page
[24]

Avoiding low-impact calculations: The marginal contribution of a client heavily depends on its position in the permutation Π. As the marginal utility usually decreases the later it joins the subset,truncatingthe Shapley value calculation if its marginal contribution is not significantly different from the previous one, e.g.|v(DN)−v(S Π i )| ≤ϵ significant...

work page
[25]

Reducing subsets STo further improve the efficiency of Shapley value computation, instead of going over all possible permutations, Monte Carlo estimation Rubinstein and Kroese

work page
[26]

[2019], Wang et al

can be applied to randomly sample permutations Π Ghorbani and Zou [2019], Jia et al. [2019], Wang et al. [2020a], Liu et al. [2022a], and then calculate the expected SV according to ϕi =E π∼π [V(S π i ∪ {i})−V(S π i )](10) whereπis the uniform distribution over all permutationsN!. Despite these optimization techniques, it still requires 3N Monte Carlo sim...

work page 2019
[27]

Shapley Value:Original SV calculation for FL client contribution evaluation, based on Equation 7

work page
[28]

GTG-Shapley:Liu et al. [2022a,b] Utilizes sub-model reconstruction with gradient updates from clients, guided sampling, in-round truncation of client permutations, and between- round truncation that drops an entire round of SV calculation if the remaining marginal utility or accuracy gain is small

work page
[29]

TMC-Shapley:Ghorbani and Zou [2019] Estimates the Shapley values by employing Monte Carlo sampling of permutations and selectively truncating the sub-model training and evaluations of irrelevant FL clients

work page 2019
[30]

[2019] Uses Shapley differences instead of Shapley values, with the original Shapley value derived from the Shapley differences by solving a feasibility problem

Group Testing:Jia et al. [2019] Uses Shapley differences instead of Shapley values, with the original Shapley value derived from the Shapley differences by solving a feasibility problem

work page 2019
[31]

[2019] In each round of the FL process, reconstructs the model of a subset of participants using their gradient updates

MR:Song et al. [2019] In each round of the FL process, reconstructs the model of a subset of participants using their gradient updates. The final SV for a participant is obtained by summing up their SVs across all rounds

work page 2019
[32]

[2020b] A group testing-based estimation approach

Fed-SV:Wang et al. [2020b] A group testing-based estimation approach. Performance of subsets used for estimating Shapley differences is evaluated on a sub-model reconstructed using participants’ model parameters, and Shapley values are independently estimated in each round and subsequently aggregated

work page
[33]

TMR:Wei et al. [2020] A gradient update-based method for SV calculation, incorporating a decay factor to include SV from previous rounds and a truncation factor to omit unimportant sub-model reconstructions

work page 2020
[34]

[2021] Calculates the delta matrix based on all quantized model parameters of clients

Correlated Agreement (CA-QP):Lv et al. [2021] Calculates the delta matrix based on all quantized model parameters of clients

work page 2021
[35]

10.KFCA-QP:Simplified CA based on quantized model parameters of clients

CA-D:Liu and Wei [2020] Calculates the delta matrix using all labels in the public dataset. 10.KFCA-QP:Simplified CA based on quantized model parameters of clients

work page 2020
[36]

KFCA-D:Simplified CA on the Test Dataset, assuming the delta matrix is the identity matrix and rewards are based on predictions on the test set

work page
[37]

A.11.7 Experiments: Additional Details MNIST Shapley-value comparison setup (details).We follow the federated evaluation protocol of Wei et al

MC KFCA-QP:Monte Carlo version of KFCA-QP, randomly choosing X parameters Y times out of all parameters and averaging the rewards. A.11.7 Experiments: Additional Details MNIST Shapley-value comparison setup (details).We follow the federated evaluation protocol of Wei et al. [2020] and use a simple CNN (21,840 parameters) trained on MNIST Deng [2012]. The ...

work page 2020
[38]

Case 1 (i.i.d., same size):Each client receives 10,840 images sampled to be balanced across digits

work page
[39]

Clients 1–2 have 80% digits1/2and 20% spread over remaining digits; clients 3–4 are skewed to3/4, etc

Case 2 (label skew, same size):Each client still has 10,840 images. Clients 1–2 have 80% digits1/2and 20% spread over remaining digits; clients 3–4 are skewed to3/4, etc

work page
[40]

labels):All clients have the same label distribution, but different dataset sizes with ratios: 10% (clients 1–2), 15% (3–4), 20% (5–6), 25% (7–8), 30% (9–10)

Case 3 (size skew, i.i.d. labels):All clients have the same label distribution, but different dataset sizes with ratios: 10% (clients 1–2), 15% (3–4), 20% (5–6), 25% (7–8), 30% (9–10)

work page
[41]

Case 4 (label noise, same size):We flip labels at increasing rates: 0% (clients 1–2), 5% (3–4), 10% (5–6), 15% (7–8), 20% (9–10)

work page
[42]

Methods compared.We compare KFCA to CA variants and to efficient Shapley value estimators; a brief description of each baseline is provided in Subsubsection A.11.6

Case 5 (feature noise, same size):We add Gaussian feature noise at increasing rates: 0% (clients 1–2), 5% (3–4), 10% (5–6), 15% (7–8), 20% (9–10). Methods compared.We compare KFCA to CA variants and to efficient Shapley value estimators; a brief description of each baseline is provided in Subsubsection A.11.6. 29 #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Clients 0.0...

work page 2024
[43]

/DoRA Liu et al. [2024]), updating W∈R d×k via W+ ∆W with ∆W=BA , A∈R r×k, B∈R d×r, r≪min(d, k) (and DoRA further decomposes weights into magnitude and direction) usually with no shared evaluation/test set available. We validate KFCA-QP on the FlowerTune LLM Leaderboard Flower Labs [2026], which— to the best of our knowledge—constitutes a first-of-its-kin...

work page 2024
[44]

• Stale Update (R1):The client submits the round-1 adapter update ∆θ1 s for all subsequent rounds, ignoring local training thereafter

Temporal Manipulation Attacks. • Stale Update (R1):The client submits the round-1 adapter update ∆θ1 s for all subsequent rounds, ignoring local training thereafter. •Lagged Update (t-k):The client submits∆θ t−k s in roundt, wherek∈ {2,3,4,5}. 31 Table 6: FlowerTune evaluation suites and downstream performance after 10 federated rounds (winner configurati...

work page 2021
[45]

• Zero Attack:The client submits an all-zero update ∆θt s =0

Free-Rider Attacks. • Zero Attack:The client submits an all-zero update ∆θt s =0 . After sign quantization, this maps to a constant report, breaking correlation with honest peers. • Random Attack:The client submits random noise ∆θt s ∼ N(0, σ 2I) scaled to match the honest update’s variance

work page
[46]

• Sparse Attack (p%):The client reports honestly for p% of coordinates and replaces the remaining(100−p)% with random values

Partial Manipulation Attacks. • Sparse Attack (p%):The client reports honestly for p% of coordinates and replaces the remaining(100−p)% with random values. We testp∈ {25,50,75}. •Sign Flip Attack:The client inverts all signs, i.e.,∆θ t s ← −∆θ t s. 32 Results.Figure 10 reports the KFCA-QP rewards across all four domains using peer-to-peer comparison (clie...

work page 2022
[47]

Optionally, clients lock a stake (Sybil resistance / slashable misbehavior) or registration is permissioned (known consortium participants)

Client registration (identity & rules).Clients register a public address in a smart contract and authenticate via signature. Optionally, clients lock a stake (Sybil resistance / slashable misbehavior) or registration is permissioned (known consortium participants)

work page
[48]

This step remains off-chain

Local training (off-chain).Clients train locally under the agreed FL specification (model, optimizer, rounds, etc.). This step remains off-chain

work page
[49]

Figure 11 illustrates the KFCA-D instance

Public evaluation signal (client-side).Given a public reference set X pub, each client produces a report: predicted labels (KFCA-D) or a quantized update / adapter signature (KFCA-QP). Figure 11 illustrates the KFCA-D instance

work page
[50]

Commitment (privacy-preserving commit–reveal).Raw reports are not posted on-chain. Instead, clients store the report off-chain (e.g., IPFS Benet [2014]) and commit to it on-chain via ci =H(CID i ∥s i), 33 R egis tr ation on bl ock chain Mod el tr aining Data A d dr ess Bl ock chain Priv at e, Public In t erplane tary Fil e Sys t em Clien ts In f er ence o...

[51]

Unbiased pairing via VRF (oracle randomness). KFCA requires pairing clients. To avoid manipulable randomness, a decentralized VRF (e.g., Chainlink VRF) outputs a verifiable random seed used to sample a pair of committed clients uniformly at random.
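As an off-chain illustration of seed-based pairing, the sketch below derives a client pair deterministically from a public seed. Here `seed_bytes` merely stands in for a VRF output; no VRF or oracle API is called, and `sample_pair` is a hypothetical helper, not part of the described contract.

```python
import hashlib
import random

def sample_pair(committed_clients, seed_bytes):
    """Sample two distinct committed clients uniformly at random from a public seed."""
    if len(committed_clients) < 2:
        raise ValueError("need at least two committed clients")
    # Derive a deterministic integer seed so every verifier reproduces the same pair.
    seed_int = int.from_bytes(hashlib.sha256(seed_bytes).digest(), "big")
    rng = random.Random(seed_int)
    # Sort first so the result does not depend on the caller's ordering.
    return tuple(rng.sample(sorted(committed_clients), 2))

pair = sample_pair(["0xA", "0xB", "0xC", "0xD"], b"example-seed")
```

Because the pair is a pure function of the seed and the committed set, anyone holding both can recompute and check the selection.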

[52]

Reveal & verification. The selected clients reveal $(\mathrm{CID}_i, s_i)$. The contract verifies $H(\mathrm{CID}_i \,\|\, s_i) = c_i$ before accepting the report pointer. This guarantees integrity: the revealed report matches the earlier commitment.
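The commit and reveal checks can be sketched as follows. This is a minimal illustration that assumes SHA-256 for $H$ and treats $s_i$ as a random salt; the hash function actually used on-chain is not stated in this passage, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(cid: str, salt: bytes) -> bytes:
    """Compute c = H(CID || s), with SHA-256 standing in for H (assumption)."""
    return hashlib.sha256(cid.encode("utf-8") + salt).digest()

def verify_reveal(commitment: bytes, cid: str, salt: bytes) -> bool:
    """Accept the revealed (CID, s) only if it reproduces the stored commitment."""
    return hmac.compare_digest(commitment, commit(cid, salt))

salt = secrets.token_bytes(32)            # kept private until the reveal phase
c_i = commit("example-cid", salt)         # posted during the commit phase
ok = verify_reveal(c_i, "example-cid", salt)
```

The salt keeps the commitment from leaking the report pointer before the reveal, and a fixed salt length keeps the concatenation unambiguous.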

[53]

On-chain KFCA scoring (randomized set selection). Using VRF-derived randomness, the contract samples the index sets required by KFCA (bonus set $M_b$ and penalty sets $M_1$, $M_2$), fetches the corresponding report entries from the off-chain object (or via an agreed data-availability interface), and computes the KFCA score for the paired clients transparently.
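A rough sketch of the scoring step is given below. It assumes the identity scoring rule (agreement counts as 1, disagreement as 0), three disjoint index sets of equal size, and normalization by the bonus-set size; these are illustrative assumptions rather than the contract's actual logic, and `kfca_score` is a hypothetical name.

```python
import random

def kfca_score(report_1, report_2, rng, n_bonus):
    """Agreement on shared bonus tasks minus agreement on mismatched penalty tasks."""
    if len(report_1) != len(report_2):
        raise ValueError("reports must cover the same tasks")
    if len(report_1) < 3 * n_bonus:
        raise ValueError("not enough tasks for three disjoint index sets")
    chosen = rng.sample(range(len(report_1)), 3 * n_bonus)
    m_b = chosen[:n_bonus]                 # bonus set: both clients, same task
    m_1 = chosen[n_bonus:2 * n_bonus]      # penalty set read from client 1
    m_2 = chosen[2 * n_bonus:]             # penalty set read from client 2
    bonus = sum(report_1[k] == report_2[k] for k in m_b)
    penalty = sum(report_1[k1] == report_2[k2] for k1, k2 in zip(m_1, m_2))
    return (bonus - penalty) / n_bonus

constant = kfca_score([1] * 30, [1] * 30, random.Random(0), n_bonus=10)
```

Under these assumptions, two constant reports agree equally often on bonus and penalty tasks, so their score is zero, whereas matching informative reports score above zero.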

[54]

Automatic payout (and optional slashing). Scores are mapped to payments by the protocol-defined transfer rule. Funds are released atomically to client addresses; if staking is enabled, provable protocol violations can trigger slashing.

Prototype status and scope. We implemented a proof-of-concept KFCA contract in Solidity for the EVM². A full decentralize...

[1]

high-value contribution

or cost (data collection, training, etc.) Pandey et al. [2020], Kang et al. [2019], Ding et al. [2020], Tang and Wong [2021]. Yet reward systems based on such simplified assumptions may not be applicable in any real-world scenario, as the dominant strategy for an individual-rational agent is dishonest behavior (report the best possible outcome without costl...

[2]

Centralization requirement. Estimating $\Delta$ requires access to all client reports, violating the decentralized and privacy-preserving nature of FL.

[3]

Computational inefficiency. Per pair, the server scans $m$ joint reports to populate an $L \times L$ count table and derive $\hat{\Delta}$, at cost $O(m + L^2)$; across the $O(n^2)$ client pairs this amounts to $O(n^2(m + L^2))$ per round, dominated by $O(n^2 m)$ in FL, where $m \gg L^2$.
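The per-pair cost can be seen in a small sketch of the estimation step: one pass over the $m$ joint reports fills the count table, and one pass over the $L \times L$ table yields $\hat{\Delta}$ and the sign-based CA score matrix. Labels are assumed to be integers in `0..L-1`, and the function names are hypothetical.

```python
def estimate_delta(reports_1, reports_2, num_labels):
    """Empirical delta matrix: joint frequency minus product of marginals."""
    m = len(reports_1)
    counts = [[0] * num_labels for _ in range(num_labels)]
    for a, b in zip(reports_1, reports_2):        # O(m) scan of joint reports
        counts[a][b] += 1
    row = [sum(counts[a]) / m for a in range(num_labels)]
    col = [sum(counts[a][b] for a in range(num_labels)) / m
           for b in range(num_labels)]
    return [[counts[a][b] / m - row[a] * col[b]   # O(L^2) table pass
             for b in range(num_labels)]
            for a in range(num_labels)]

def ca_score_matrix(delta):
    """CA scoring rule: 1 where the delta entry is positive, 0 otherwise."""
    return [[1 if d > 0 else 0 for d in row] for row in delta]

delta_hat = estimate_delta([0, 0, 1, 1], [0, 0, 1, 1], num_labels=2)
scores = ca_score_matrix(delta_hat)
```

Repeating this for every client pair is what produces the quadratic dependence on $n$ noted above.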

[4]

Delayed reward computation. Because $\Delta$ must be globally estimated, CA cannot compute payments in real time, limiting deployability in online or blockchain-based implementations.

A.2.2 Worked label-flipping example

Substituting the CA score matrix from Definition 2.4 into Eq. (1), the expected reward under deterministic reporting functions f_1, f_2 : [L] → [L...

[5]

Payment feasibility. The mechanism can assign (e.g., monetary) rewards to clients based on their reports.

[6]

Risk-neutrality. Clients are risk-neutral and only care about their own payment and effort cost. The utility for client $i$ is $U_i = E_i - c(e_i)$, where $c(1) > c(0) = 0$, with $E_i$ the expected total payment across tasks and $c(e_i)$ the total effort cost.

[7]

Binary effort. Each client $i \in N$ independently decides whether to exert effort on each task $k \in M$, denoted $e_i^k \in \{0, 1\}$, where 1 = effortful and 0 = shirking.

[8]

Ex-ante identical tasks. All tasks are drawn independently from a common prior distribution over ground-truth labels. Clients cannot distinguish tasks before seeing their signals.
• Signal generation. Each task $k \in M$ has an unknown ground-truth label $Y^k \in [L]$. Each client $i$ receives a private signal $Z_i^k \in [L]$, sampled based on their effort level: P(Z_i^k = a | ...

[9]

Tasks are ex-ante identical and randomly assigned; clients cannot distinguish between bonus and penalty tasks.

[10]

Clients receive conditionally independent signals $Z_1, Z_2 \in [L]$, given the latent label $Y \in [L]$.

[11]

Clients apply deterministic reporting functions $f_1, f_2 : [L] \to [L]$. Define the delta matrix $\Delta \in \mathbb{R}^{L \times L}$ as
$$\Delta(a, b) := P(Z_1 = a, Z_2 = b) - P(Z_1 = a)\,P(Z_2 = b),$$
which is symmetric and mean-centered:
$$\Delta(a, b) = \Delta(b, a), \qquad \sum_b \Delta(a, b) = \sum_a \Delta(a, b) = 0.$$
The CA mechanism defines the scoring rule
$$S_{CA}(a, b) := \begin{cases} 1, & \text{if } \Delta(a, b) > 0, \\ 0, & \text{otherwise.} \end{cases}$$
Proof. Let f_1, f_2 : [L] → [L] be arbitrar...

[12]

up to label permutation

Uninformed strategies get zero: Suppose $f_1$ is uninformed, e.g., $f_1(a) = r$ for all $a \in [L]$. Then
$$E(f_1, f_2) = \sum_{a,b} \Delta(a, b)\, S_{CA}(r, f_2(b)) = \sum_b S_{CA}(r, f_2(b)) \sum_a \Delta(a, b) = 0,$$
because $\sum_a \Delta(a, b) = 0$ for all $b$. Therefore the truthful strategy achieves maximal expected reward. Equality holds only for strategy profiles that preserve all positively correlated sig...

[13]

Signals are conditionally independent given the latent truth.

[14]

This sign structure ensures that agreement reflects shared latent truth and enables KFCA to remain incentive-compatible and knowledge-free even in complex settings.

Each client's signal is informative about the latent truth, we can enforce a categorical structure through a transformation pipeline that maps arbitrary outputs into categorical representations, producing a correlation matrix $\tilde{\Delta}(a, b) = P(\tilde{Z}_1 = a, \tilde{Z}_2 = b) - P(\tilde{Z}_1 = a)\,P(\tilde{Z}_2 = b)$, which satisfies the categorical sign condition: sign(Δ̃(a, b)) = { > 0, if a = b; < 0, if a ≠...

[15]

Latent-truth alignment. Introduce a latent categorical variable $Y^k \in [L']$ representing the true state for each task. Clients decode this truth via maximum a posteriori (MAP) inference:
$$\tilde{Z}_i^k = \arg\max_{a \in [L']} P(Y^k = a \mid Z_i^k) \propto \pi(a)\, P_i(Z_i^k \mid Y^k = a),$$
where $\pi(a)$ is a prior over latent states. This step ensures cross-client consistency: reports with the s...

[16]

better than random

Categorical regularization. When signals are noisy or sparsely distributed, we apply a smoothing step to the estimated correlation matrix:
$$\tilde{\Delta}(a, b) := \operatorname{sign}(E[\Delta(a, b)]) \cdot |E[\Delta(a, b)]|^{\gamma}, \qquad \gamma \in (0, 1).$$
This transformation preserves sign structure while enhancing diagonal dominance, ensuring that $\operatorname{sign}(\tilde{\Delta}) = I$. Regularization note. This smoothing step is a modeling de...

[17]

Shared optimization surface. Clients descend the same loss from the same global checkpoint. Gradient magnitudes may differ, but signs tend to agree on most coordinates; the optimum lives in the same direction for everyone.

[18]

Conditional independence by construction. Within a round, clients run local SGD without inter-client communication. Conditional on the global checkpoint and the optimum $Y$, signals are independent.

A.9.3 Mapping from non-IID scenarios to α_i

Non-IID scenario | Mechanism | Effect on α_i
Label skew (Dirichlet α_dir) | Imbalanced class gradients on shared layers | Highe...

[19]

Efficiency/Pareto Optimality: The total payout to all players in the coalition $N$ should be equal to the total value generated by the coalition, i.e., $\sum_{i \in N} \phi_i(v) = v(N)$. This ensures that no value is left unallocated.

[20]

Symmetry: If two players $i$ and $j$ contribute equally to all coalitions they are part of, they should receive the same payoff, i.e., if $v(S \cup \{i\}) = v(S \cup \{j\})$ for all $S \subseteq N \setminus \{i, j\}$, then $\phi_i(v) = \phi_j(v)$. This ensures equal pay for equal contribution.

Footnote 1: Let there be four clients 1, 2, 3, and 4. Given the permutation $\Pi = 4213$, then $S_3^{\Pi} = \{4, 2, 1\}$.

Table 4: Accuracy v(...

[21]

Null Player: If a player $i$ does not contribute any value to any coalition it is part of, the player should receive a payoff of 0, i.e., if $v(S \cup \{i\}) = v(S)$ for all $S \subseteq N \setminus \{i\}$, then $\phi_i(v) = 0$. This axiom ensures that players that do not contribute to any coalition receive nothing.

[22]

Additivity: If $u$ and $v$ are characteristic functions, then the payoff for a player under the sum of these games is equal to the sum of the player's payoffs under each of these games, i.e., $\phi(u + v) = \phi(u) + \phi(v)$. This axiom ensures the Shapley value is consistent when games are combined.

A.11.2 Shapley Value in Federated Learning

Let A be a learning algorithm a...

[23]

Avoiding retraining: Shapley value computation can be expedited by reconstructing submodels $\theta_S$ from gradients instead of retraining the model on $D_S$ for every subset $S$ (equation 8), according to
$$v(S) = V\Big(\theta_{\text{global}} + \sum_{i \in S} \frac{|D_i|}{|D_S|}\, \Delta\theta_i,\; D_{\text{pub}}\Big), \tag{9}$$
omitting computationally expensive training. However, the evaluating entity has to have access to gradients of all participating clients $i \in N$ ...

[24]

Avoiding low-impact calculations: The marginal contribution of a client heavily depends on its position in the permutation $\Pi$. As the marginal utility usually decreases the later it joins the subset, truncating the Shapley value calculation if its marginal contribution is not significantly different from the previous one, e.g. $|v(D_N) - v(S_i^{\Pi})| \le \epsilon$ significant...

[25]

Reducing subsets S: To further improve the efficiency of Shapley value computation, instead of going over all possible permutations, Monte Carlo estimation Rubinstein and Kroese

[26]

can be applied to randomly sample permutations $\Pi$ Ghorbani and Zou [2019], Jia et al. [2019], Wang et al. [2020a], Liu et al. [2022a], and then calculate the expected SV according to
$$\phi_i = \mathbb{E}_{\Pi \sim \pi}\big[V(S_i^{\Pi} \cup \{i\}) - V(S_i^{\Pi})\big] \tag{10}$$
where $\pi$ is the uniform distribution over all $N!$ permutations. Despite these optimization techniques, it still requires 3N Monte Carlo sim...

[27]

Shapley Value: Original SV calculation for FL client contribution evaluation, based on Equation 7.

[28]

GTG-Shapley: Liu et al. [2022a,b]. Utilizes sub-model reconstruction with gradient updates from clients, guided sampling, in-round truncation of client permutations, and between-round truncation that drops an entire round of SV calculation if the remaining marginal utility or accuracy gain is small.

[29]

TMC-Shapley: Ghorbani and Zou [2019]. Estimates the Shapley values by employing Monte Carlo sampling of permutations and selectively truncating the sub-model training and evaluations of irrelevant FL clients.

[30]

Group Testing: Jia et al. [2019]. Uses Shapley differences instead of Shapley values, with the original Shapley value derived from the Shapley differences by solving a feasibility problem.

[31]

MR: Song et al. [2019]. In each round of the FL process, reconstructs the model of a subset of participants using their gradient updates. The final SV for a participant is obtained by summing up their SVs across all rounds.

[32]

Fed-SV: Wang et al. [2020b]. A group testing-based estimation approach. Performance of subsets used for estimating Shapley differences is evaluated on a sub-model reconstructed using participants' model parameters, and Shapley values are independently estimated in each round and subsequently aggregated.

[33]

TMR: Wei et al. [2020]. A gradient update-based method for SV calculation, incorporating a decay factor to include SV from previous rounds and a truncation factor to omit unimportant sub-model reconstructions.

[34]

Correlated Agreement (CA-QP): Lv et al. [2021]. Calculates the delta matrix based on all quantized model parameters of clients.

[35]

CA-D: Liu and Wei [2020]. Calculates the delta matrix using all labels in the public dataset.

10. KFCA-QP: Simplified CA based on quantized model parameters of clients.

[36]

KFCA-D: Simplified CA on the Test Dataset, assuming the delta matrix is the identity matrix and rewards are based on predictions on the test set.

[37]

MC KFCA-QP: Monte Carlo version of KFCA-QP, randomly choosing X parameters Y times out of all parameters and averaging the rewards.

A.11.7 Experiments: Additional Details

MNIST Shapley-value comparison setup (details). We follow the federated evaluation protocol of Wei et al. [2020] and use a simple CNN (21,840 parameters) trained on MNIST Deng [2012]. The ...

[38]

Case 1 (i.i.d., same size): Each client receives 10,840 images sampled to be balanced across digits.

[39]

Case 2 (label skew, same size): Each client still has 10,840 images. Clients 1–2 have 80% digits 1/2 and 20% spread over remaining digits; clients 3–4 are skewed to 3/4, etc.

[40]

Case 3 (size skew, i.i.d. labels): All clients have the same label distribution, but different dataset sizes with ratios: 10% (clients 1–2), 15% (3–4), 20% (5–6), 25% (7–8), 30% (9–10).

[41]

Case 4 (label noise, same size): We flip labels at increasing rates: 0% (clients 1–2), 5% (3–4), 10% (5–6), 15% (7–8), 20% (9–10).

[42]

Case 5 (feature noise, same size): We add Gaussian feature noise at increasing rates: 0% (clients 1–2), 5% (3–4), 10% (5–6), 15% (7–8), 20% (9–10).

Methods compared. We compare KFCA to CA variants and to efficient Shapley value estimators; a brief description of each baseline is provided in Subsubsection A.11.6.

[Figure residue: x-axis "Clients" (#1 to #10); y-axis ticks from 0.0...]

[43]

/DoRA Liu et al. [2024]), updating $W \in \mathbb{R}^{d \times k}$ via $W + \Delta W$ with $\Delta W = BA$, $A \in \mathbb{R}^{r \times k}$, $B \in \mathbb{R}^{d \times r}$, $r \ll \min(d, k)$ (and DoRA further decomposes weights into magnitude and direction), usually with no shared evaluation/test set available. We validate KFCA-QP on the FlowerTune LLM Leaderboard Flower Labs [2026], which, to the best of our knowledge, constitutes a first-of-its-kin...

[44]

Temporal Manipulation Attacks.
• Stale Update (R1): The client submits the round-1 adapter update $\Delta\theta_s^1$ for all subsequent rounds, ignoring local training thereafter.
• Lagged Update (t-k): The client submits $\Delta\theta_s^{t-k}$ in round $t$, where $k \in \{2, 3, 4, 5\}$.

Table 6: FlowerTune evaluation suites and downstream performance after 10 federated rounds (winner configurati...
