pith. machine review for the scientific record.

arxiv: 2605.10499 · v2 · submitted 2026-05-11 · 💻 cs.DC

Recognition: no theorem link

Privacy-preserving Chunk Scheduling in a BitTorrent Implementation of Federated Learning

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 05:48 UTC · model grok-4.3

classification 💻 cs.DC
keywords: federated learning · BitTorrent · privacy preservation · decentralized learning · chunk scheduling · source unlinkability · P2P dissemination · serverless FL

The pith

A short warm-up phase with non-owner-first chunk scheduling in BitTorrent hides federated learning update sources from local observers while keeping FedAvg aggregation intact by round end.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FLTorrent, a dissemination layer that replaces the central server in federated learning with a BitTorrent-style peer-to-peer network. It adds a brief warm-up phase using pre-round obfuscation, randomized lags, and coordination-only scheduling that sends non-owner chunks first. This design upper-bounds the chance that a local observer can link a chunk to its source by the fraction of owner chunks in any sender's eligible set. As a result, attribution success falls close to neighborhood-level random guessing for typical nodes, the bound tightens as network size grows, and the system stays robust even if several peers collude. Experiments show the warm-up occupies roughly 12 percent of each round and that the full system adds only 6-10 percent overhead relative to plain BitTorrent, even at LLM-scale model sizes over multi-gigabit links.

Core claim

Under an observation-only local adversary, FLTorrent drives attribution success close to neighborhood-level random guessing for typical nodes, improves with network size, and remains robust under collusion, while a GreedyFastestFirst heuristic reaches about 92 percent of a bandwidth-optimal max-flow bound and the warm-up stays a stable 12 percent share of round time across 100-500 peers.

What carries the argument

The warm-up phase that performs pre-round obfuscation and non-owner-first scheduling with the tracker off the data path, before handing control to vanilla BitTorrent swarming.
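A minimal sketch of non-owner-first scheduling and the basic cover-set bound may make this concrete. The buffer representation and function names below are illustrative assumptions, not the paper's implementation:

```python
import random

def schedule_chunks(buffer, owner_id, rng=random):
    """Order a sender's chunk buffer non-owner-first, as in the warm-up.

    `buffer` is a list of (chunk_id, origin_peer) pairs held by the
    sender `owner_id`; chunks the sender originated are deferred so that
    early transfers carry only relayed (non-owner) chunks.
    """
    non_owner = [c for c in buffer if c[1] != owner_id]
    owner = [c for c in buffer if c[1] == owner_id]
    rng.shuffle(non_owner)  # randomize order within each group
    rng.shuffle(owner)
    return non_owner + owner

def attribution_bound(buffer, owner_id):
    """Basic upper bound on a local observer's per-transfer attribution
    posterior: the fraction of owner chunks in the sender's eligible set."""
    if not buffer:
        return 0.0
    owned = sum(1 for c in buffer if c[1] == owner_id)
    return owned / len(buffer)
```

On this reading, a sender holding 2 of its own chunks in an eligible set of 5 yields a bound of 0.4, and the bound shrinks as relayed chunks accumulate during warm-up.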

If this is right

  • Attribution success rate falls toward random guessing and continues to improve as the number of peers increases.
  • The scheme stays effective even when several nodes collude to pool observations.
  • A simple greedy heuristic, GreedyFastestFirst, achieves about 92 percent of the bandwidth-optimal dissemination rate.
  • Total round-time overhead stays between 6 and 10 percent relative to plain BitTorrent at 7-10 Gbps access speeds.
  • The warm-up phase occupies a stable 12 percent of each round across networks of 100 to 500 peers.
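The GreedyFastestFirst heuristic is not fully specified in this review; one plausible reading, sketched below under that assumption, greedily assigns each pending chunk to the link that frees up earliest, preferring faster links on ties:

```python
import heapq

def greedy_fastest_first(transfers, link_bw):
    """Greedy fastest-first assignment of chunk transfers to links.

    `transfers` is a list of chunk sizes (bytes); `link_bw` maps a link
    id to its bandwidth (bytes/s). Each chunk goes to the link that
    becomes free earliest (ties broken toward higher bandwidth), and the
    simulated makespan under this rule is returned.
    """
    # heap entries: (time the link becomes free, -bandwidth, link id)
    heap = [(0.0, -bw, lid) for lid, bw in link_bw.items()]
    heapq.heapify(heap)
    makespan = 0.0
    for size in sorted(transfers, reverse=True):  # largest chunks first
        free_at, neg_bw, lid = heapq.heappop(heap)
        done = free_at + size / (-neg_bw)
        makespan = max(makespan, done)
        heapq.heappush(heap, (done, neg_bw, lid))
    return makespan
```

Comparing this makespan against a max-flow lower bound on dissemination time would reproduce the kind of ratio the paper reports (about 92 percent), though the paper's exact scheduling rule may differ.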

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same warm-up pattern could be reused in other peer-to-peer ML dissemination settings that need both fast mixing and source hiding.
  • Because the unlinkability guarantee is stated in terms of cover-set fractions, it may be possible to tune the warm-up length against measured bandwidth to hit a target privacy level without re-running full experiments.
  • The approach suggests a practical way to add within-round unlinkability on top of existing content protections such as differential privacy without changing the aggregation rule.

Load-bearing premise

The warm-up phase preserves enough global information that FedAvg-style aggregation semantics remain intact by the round deadline despite deliberate early non-owner mass and randomized lags.

What would settle it

Run a controlled deployment with at least 100 nodes, have a local observer record all transfers, and check whether the fraction of correctly attributed owner chunks exceeds the neighborhood random-guessing baseline by more than a few percentage points.
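A minimal Monte Carlo sketch of this check, under an assumed observation model (the `owner_frac` leakage parameter and uniform-guessing fallback are hypothetical, not from the paper):

```python
import random

def attribution_experiment(degree, owner_frac, n_transfers, rng=random):
    """Monte Carlo sketch of the proposed check. A local observer sees
    `n_transfers` transfers from a node with `degree` neighbors. With
    probability `owner_frac` the transferred chunk is the sender's own
    (correctly linkable); otherwise the observer can do no better than
    a uniform guess over the neighborhood. Returns (measured attribution
    rate, neighborhood random-guessing baseline)."""
    baseline = 1.0 / degree
    hits = 0
    for _ in range(n_transfers):
        if rng.random() < owner_frac:
            hits += 1  # owner chunk leaks its source
        elif rng.random() < baseline:
            hits += 1  # lucky uniform guess among neighbors
    return hits / n_transfers, baseline
```

If the measured rate exceeds the baseline by more than a few percentage points, the warm-up is not delivering the claimed unlinkability against this observer.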

Figures

Figures reproduced from arXiv: 2605.10499 by Javad Dogani, Kaitai Liang, Naicheng Li, Nikolaos Laoutaris, Rui Wang.

Figure 1: a) Example overlay for a group with 4 nodes and 4 …
Figure 2: Convergence on MNIST and CIFAR-10 under IID and …
Figure 5: Warm-up duration as the warm-up threshold …
Figure 3: Warm-up bandwidth utilization for online heuristics vs. …
Figure 2: Warm-up bandwidth utilization for online heuristics vs. …
Figure 4: End-to-end time decomposition under privacy ablations …
Figure 5: Privacy ablation: maximum ASR under three inference …
Figure 6: Privacy ablation: maximum ASR under three inference …
Figure 7: ASR under warm-up defenses. The chance that at least one attacker succeeds rises from 13.56% (a=5) to 30.82% (a=25), but per-attacker ASR stays low (11.31%-14.32%), indicating resilience to observation-only collusion.
Original abstract

Traditional federated learning (FL) relies on a central aggregator server, which can create performance bottlenecks and privacy risks. Decentralized mix-and-forward designs remove the server, but repeated local mixing can attenuate global information under heterogeneity and expose peer-to-peer neighborhoods as a privacy attack surface. To preserve FedAvg-style aggregation semantics over updates reconstructable by the round deadline while scaling dissemination, we present FLTorrent, a BitTorrent-based dissemination layer for serverless FL with a short warm-up. Warm-up hardens within-round source unlinkability, a dissemination-layer goal orthogonal to content protections such as DP or secure aggregation, via pre-round obfuscation, randomized lags, and coordination-only non-owner-first scheduling with the tracker off the data path, before switching to vanilla BitTorrent swarming. We upper-bound the per-transfer attribution posterior by the fraction of owner chunks in a sender's eligible cover set, and derive a tighter high-probability bound that improves with early non-owner mass. A simple heuristic, GreedyFastestFirst, attains about 92% of a bandwidth-optimal max-flow upper bound, while warm-up remains a stable about 12% share of a round across 100-500 peers. Under an observation-only local adversary, FLTorrent drives attribution success close to neighborhood-level random guessing for typical nodes, improves with network size, and remains robust under collusion. In LLM-scale dissemination stress tests over 7-10 Gbps access links, FLTorrent adds only about 6-10% round-time overhead relative to BitTorrent-only. Overall, FLTorrent shows that within-round unlinkability and BitTorrent-level efficiency can co-exist with predictable, low overheads at scale.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes FLTorrent, a BitTorrent-based dissemination layer for serverless federated learning. It introduces a short warm-up phase using non-owner-first scheduling, randomized lags, and coordination-only obfuscation to harden within-round source unlinkability, derives upper bounds on per-transfer attribution posteriors under an observation-only local adversary, and shows via simulation that a GreedyFastestFirst heuristic attains 92% of a max-flow upper bound while adding only 6-10% round-time overhead.

Significance. If the attribution bounds are shown to hold under the actual heuristic and the warm-up is proven to preserve FedAvg aggregation semantics, the work would be significant for demonstrating that unlinkability and near-optimal efficiency can coexist in decentralized FL at scale without a central server.

major comments (2)
  1. [Attribution posterior bound derivation] The upper bound on the per-transfer attribution posterior (stated in the abstract and derived from the scheduling rules and cover-set definition) assumes uniform random cover sets. However, the GreedyFastestFirst heuristic selects chunks by speed and therefore correlates owner-chunk assignment with bandwidth under heterogeneous conditions, violating the uniformity needed for the high-probability bound that improves with early non-owner mass. This assumption is load-bearing for the central claim that attribution success approaches neighborhood-level random guessing.
  2. [Warm-up phase and aggregation semantics] The claim that the warm-up phase (reported as a stable ~12% share of the round) preserves sufficient global information for FedAvg-style aggregation by the round deadline, despite deliberate early non-owner mass and randomized lags, is stated but not accompanied by a detailed argument or convergence analysis showing that information loss remains negligible.
minor comments (2)
  1. [Abstract and experimental results] The concrete performance figures (92% of max-flow, 12% warm-up share, 6-10% overhead) are presented without reported variance, number of runs, or confidence intervals; adding these would strengthen the experimental claims.
  2. [Notation and definitions] Notation for cover sets, eligible senders, and the exact definition of the high-probability bound should be introduced with a short table or diagram to improve readability.
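The first major comment can be made concrete with a toy model. Everything below (the 10% base share and the bandwidth-proportional scaling) is an illustrative assumption, not the paper's analysis: when owner-chunk share scales with a peer's bandwidth, the worst-case owner fraction, and hence the attribution bound, degrades for fast peers even though the average share is unchanged.

```python
def max_owner_fraction(bandwidths, set_size, correlated):
    """Toy model of the referee's concern. Each peer's eligible cover
    set holds `set_size` chunks. With `correlated=True`, a peer's
    owner-chunk share scales with its relative bandwidth (a stand-in for
    speed-driven scheduling); otherwise every peer keeps a fixed 10%
    owner share. Returns the worst-case owner fraction, which
    upper-bounds attribution for the most exposed peer."""
    base = 0.10  # uniform owner-chunk share
    total = sum(bandwidths)
    fracs = []
    for bw in bandwidths:
        share = base * len(bandwidths) * bw / total if correlated else base
        owned = min(set_size, round(share * set_size))
        fracs.append(owned / set_size)
    return max(fracs)
```

With bandwidths [1, 1, 1, 7], the uniform model keeps every peer at a 0.10 bound, while the correlated model pushes the fastest peer to 0.28, so a bound derived from uniform cover sets need not be representative of the heuristic's tail.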

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the assumptions underlying our theoretical bounds and the preservation of aggregation semantics. We respond to each major comment below.

read point-by-point responses
  1. Referee: [Attribution posterior bound derivation] The upper bound on the per-transfer attribution posterior (stated in the abstract and derived from the scheduling rules and cover-set definition) assumes uniform random cover sets. However, the GreedyFastestFirst heuristic selects chunks by speed and therefore correlates owner-chunk assignment with bandwidth under heterogeneous conditions, violating the uniformity needed for the high-probability bound that improves with early non-owner mass. This assumption is load-bearing for the central claim that attribution success approaches neighborhood-level random guessing.

    Authors: The basic upper bound on the per-transfer attribution posterior follows directly from the fraction of owner chunks in a sender's eligible cover set and the scheduling rules; it holds by construction regardless of how cover sets are populated. The tighter high-probability bound is derived under an explicit assumption of uniform random cover sets to illustrate improvement with early non-owner mass. We acknowledge that GreedyFastestFirst may introduce correlations between owner-chunk assignment and bandwidth in heterogeneous settings, so the high-probability bound does not apply verbatim to the heuristic. Our central empirical claim, however, rests on simulation results under heterogeneous bandwidth that show attribution success approaching neighborhood-level random guessing. We will revise the manuscript to separate the assumptions clearly, state that the high-probability bound applies to the idealized random case, and add a short discussion of the heuristic's practical effect, supported by the existing simulation data. This will be a partial revision.

  2. Referee: [Warm-up phase and aggregation semantics] The claim that the warm-up phase (reported as a stable ~12% share of the round) preserves sufficient global information for FedAvg-style aggregation by the round deadline, despite deliberate early non-owner mass and randomized lags, is stated but not accompanied by a detailed argument or convergence analysis showing that information loss remains negligible.

    Authors: The warm-up phase uses non-owner-first scheduling and randomized lags only for the initial dissemination window; the system then switches to standard BitTorrent swarming, which guarantees that every peer receives the complete set of chunks by the round deadline. Because FedAvg aggregation is performed on the full model update once all chunks have arrived, the early non-owner mass and lags affect only the timing and source obfuscation of transfers, not the final information content. Our experiments confirm that all peers complete dissemination on schedule across the tested scales. We agree that an explicit argument would strengthen the presentation. We will add a concise subsection explaining that the warm-up is a fixed, small fraction of the round (~12%) and that subsequent vanilla swarming ensures zero information loss at the deadline, thereby preserving FedAvg semantics. This will be incorporated as a revision.

Circularity Check

0 steps flagged

Attribution bounds and efficiency claims derived independently from scheduling rules and external bounds

full rationale

The paper states an upper bound on per-transfer attribution posterior explicitly as the fraction of owner chunks in a sender's eligible cover set, followed by a high-probability tightening that improves with early non-owner mass; both are presented as direct consequences of the warm-up scheduling rules rather than fitted to the same simulation data used for the final attribution-success claim. The 92% optimality figure is reported as a simulation outcome against an external bandwidth-optimal max-flow upper bound. No load-bearing step reduces by construction to a self-citation, a fitted parameter renamed as prediction, or an ansatz smuggled via prior work. The central unlinkability result is therefore self-contained against the stated observation-only adversary model and does not collapse to its inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The design rests on standard peer-to-peer network assumptions and an observation-only local adversary model; no new physical entities are postulated and the only measured quantity (warm-up share) is reported as stable rather than fitted to achieve the target result.

free parameters (1)
  • warm-up duration fraction
    Observed to stabilize near 12% of round time across 100-500 peers; used to set the switch to vanilla BitTorrent.
axioms (2)
  • domain assumption Adversary observes only forwarded chunks and has no access to content or timing side channels beyond the dissemination layer
    Invoked when claiming attribution success approaches random guessing.
  • domain assumption FedAvg aggregation semantics remain intact provided all updates arrive by the round deadline
    Used to justify that the warm-up does not break the learning algorithm.

pith-pipeline@v0.9.0 · 5614 in / 1561 out tokens · 44709 ms · 2026-05-15T05:48:58.502197+00:00 · methodology

