Silent Failures in Federated Personalization of Foundation Models

Alex Bui; YongKyung Oh

arxiv: 2606.00947 · v1 · pith:RQSFFFOHnew · submitted 2026-05-31 · 💻 cs.LG · cs.AI

Silent Failures in Federated Personalization of Foundation Models

YongKyung Oh , Alex Bui This is my paper

Pith reviewed 2026-06-28 18:00 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords federated learningfoundation modelstrustworthinesssilent failurespersonalizationbias amplificationfairnessalignment

0 comments

The pith

Federated personalization of foundation models creates silent failures in bias, fairness, and alignment that privacy rules make hard to detect.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that personalizing foundation models on private decentralized data via federated learning, under regulatory demands for monitoring, produces a distinct class of trustworthiness problems called silent failures. These encompass amplified bias, fairness collapse, and alignment erosion that stay hidden because privacy protections block access to model behavior. Existing benchmarks split into two incompatible groups: federated ones track system metrics but miss behavioral issues, while centralized ones need direct model access that violates privacy. The authors define six specific failure modes that stem from combining personalization, dataset shifts, and federated limits, and conclude that privacy-preserving methods by themselves do not deliver trustworthy systems. They outline a research agenda focused on behavioral evaluation that respects privacy constraints.

Core claim

The convergence of federated learning for foundation model personalization with post-market monitoring requirements produces an under-recognized category of trustworthiness failures termed silent failures. These include amplified bias, fairness collapse, and alignment erosion that remain difficult to detect because federated privacy constraints limit visibility into model behavior. A landscape analysis shows a structural divide between federated benchmarks, which evaluate system performance with little behavioral insight, and centralized trustworthiness benchmarks, which require model access incompatible with privacy. A taxonomy of six silent failure modes arises from the interaction of foun

What carries the argument

The taxonomy of six silent failure modes arising from the interaction of foundation model personalization, dataset shift, and core federated constraints.

If this is right

Amplified bias can accumulate in personalized models without external notice.
Fairness properties can degrade across clients due to local data shifts.
Model alignment with intended values can erode while the system appears functional.
Standard performance benchmarks miss these behavioral problems.
Privacy-preserving training must be paired with new evaluation methods to achieve trustworthiness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Regulatory requirements for post-deployment monitoring may require entirely new technical approaches beyond current federated methods.
The same visibility limits could affect trustworthiness assessment in other privacy-focused distributed training settings.
Synthetic or proxy-based tests for behavioral drift might serve as practical starting points for the proposed evaluation agenda.

Load-bearing premise

The assumption that privacy constraints in federated learning inherently limit visibility into model behavior enough to prevent detection of issues like bias amplification.

What would settle it

An empirical case in which a federated personalization system is shown to allow detection of behavioral failures such as increased bias or fairness collapse without violating privacy constraints, or a benchmark that successfully measures model behavior under federated limits.

Figures

Figures reproduced from arXiv: 2606.00947 by Alex Bui, YongKyung Oh.

read the original abstract

Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale under growing regulatory requirements for post-market monitoring. We argue that this convergence creates a distinct and under-recognized class of trustworthiness failures, which we term "Silent Failures." These include amplified bias, fairness collapse, and alignment erosion that may remain difficult to detect because federated learning's privacy constraints limit visibility into model behavior. A landscape analysis of existing benchmarks reveals a structural divide. Federated benchmarks evaluate system performance but provide limited insight into model behavior, whereas centralized trustworthiness benchmarks assess behavior but require model access incompatible with federated privacy. We introduce a taxonomy of six silent failure modes arising from the interaction of foundation model personalization, dataset shift, and core federated constraints. Our analysis shows that privacy-preserving training alone is insufficient for trustworthy deployment. We conclude with a research agenda for privacy-preserving behavioral evaluation and propose that silent failures become a standard diagnostic category for trustworthy federated artificial intelligence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

A position paper that names silent failures in federated FM personalization and sketches a taxonomy, but rests entirely on argument without data or experiments.

read the letter

This paper flags a potential gap at the intersection of federated learning and foundation model personalization. It argues that privacy constraints can hide failures like bias amplification, fairness collapse, and alignment erosion, and it offers a taxonomy of six modes plus a research agenda for behavioral evaluation under those constraints.

What stands out is the framing of the benchmark divide: federated benchmarks focus on system metrics while centralized trustworthiness ones need model access that privacy rules block. That synthesis is new enough to be worth noting, and the paper does a clean job laying out why privacy-preserving training by itself may not cover trustworthiness in regulated deployments.

The main limitation is that the work stays conceptual. The landscape analysis is asserted rather than shown with concrete benchmark comparisons or failure examples. No experiments, no case studies, and no formal derivation back the claim that these failures are widespread or distinct. The taxonomy is presented as a synthesis, but without evidence it functions more as a hypothesis generator than a demonstrated result.

The paper is aimed at people working on trustworthy federated systems and post-deployment monitoring. It deserves a serious referee because the issue it raises is concrete and timely for anyone shipping personalized models under privacy rules, even if the current version is light on support. I would send it out for review to see whether the community can turn the taxonomy into testable claims.

Referee Report

1 major / 2 minor

Summary. The paper claims that federated personalization of foundation models, combined with regulatory monitoring needs, creates an under-recognized class of trustworthiness failures termed 'Silent Failures' (amplified bias, fairness collapse, alignment erosion). These are difficult to detect due to privacy constraints in federated learning. A landscape analysis of benchmarks reveals a structural divide between system-focused federated benchmarks and behavior-focused centralized ones. The paper introduces a taxonomy of six silent failure modes arising from personalization, dataset shift, and federated constraints, concludes that privacy-preserving training alone is insufficient, and proposes a research agenda for privacy-preserving behavioral evaluation.

Significance. If the taxonomy and structural-divide argument hold, the work provides a useful conceptual framework for identifying gaps in evaluating personalized foundation models under privacy constraints. It could help shape standards and diagnostics for trustworthy federated AI. The explicit research agenda is a constructive contribution, though the absence of empirical validation or formal derivations means the significance is primarily in problem framing rather than demonstrated results.

major comments (1)

[Abstract / taxonomy introduction] The central claim that privacy-preserving training is insufficient rests on the taxonomy of six failure modes and the landscape analysis (Abstract). Without explicit listing or derivation of the six modes from the interaction of foundation-model personalization, dataset shift, and federated constraints, or concrete illustrations showing they are distinct from previously documented issues, the load-bearing argument remains an assertion rather than a substantiated classification.

minor comments (2)

[Abstract] The landscape analysis is invoked to establish the benchmark divide but provides no named benchmarks, selection criteria, or quantitative summary; adding a table or enumerated list would strengthen the motivation without altering the conceptual nature.
Terminology such as 'Silent Failures' is introduced as novel; a brief comparison paragraph distinguishing it from related concepts (e.g., undetected distribution shift in FL) would improve clarity.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for this constructive comment on strengthening the presentation of our taxonomy. We address the point directly below.

read point-by-point responses

Referee: [Abstract / taxonomy introduction] The central claim that privacy-preserving training is insufficient rests on the taxonomy of six failure modes and the landscape analysis (Abstract). Without explicit listing or derivation of the six modes from the interaction of foundation-model personalization, dataset shift, and federated constraints, or concrete illustrations showing they are distinct from previously documented issues, the load-bearing argument remains an assertion rather than a substantiated classification.

Authors: We agree the abstract states the existence of the taxonomy at a high level without enumeration. The full manuscript derives the six modes in Section 3 by tracing each to the specific interaction of (i) foundation-model personalization on client data, (ii) dataset shift across clients, and (iii) federated constraints that preclude direct inspection or centralized auditing. Concrete distinctions from prior work (e.g., standard bias amplification in centralized training or known FL system failures) are provided via examples such as local overfitting that silently amplifies subgroup bias only visible after aggregation. To address the referee's concern, we will revise both the abstract and the opening of Section 3 to include an explicit numbered list of the six modes together with a one-paragraph derivation sketch and a short table contrasting each with documented issues. This change makes the classification immediately verifiable while preserving the paper's framing focus. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The manuscript is a conceptual position paper that introduces terminology ('Silent Failures') and a research agenda without equations, derivations, fitted parameters, or any load-bearing technical steps. The landscape analysis of benchmarks is offered as motivation for the taxonomy rather than a verified quantitative finding, and no self-citation chains or ansatzes reduce claims to prior inputs by construction. The central argument is framed explicitly as an argument and taxonomy, not as a computed output.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The paper rests on domain assumptions about privacy constraints and introduces a new conceptual category without quantitative parameters, formal proofs, or external empirical anchors visible in the abstract.

axioms (1)

domain assumption Federated learning's privacy constraints limit visibility into model behavior
Invoked in the abstract to explain why silent failures remain difficult to detect.

invented entities (1)

Silent Failures no independent evidence
purpose: To name and categorize a class of trustworthiness issues in federated foundation model personalization that evade standard detection
New term and taxonomy introduced to organize the described problems; no independent falsifiable evidence provided in the abstract.

pith-pipeline@v0.9.1-grok · 5690 in / 1311 out tokens · 26827 ms · 2026-06-28T18:00:04.971344+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

74 extracted references · 33 canonical work pages · 3 internal anchors

[1]

Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, et al. 2022. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. doi:10.48550/arXiv.2204.05862

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2204.05862 2022
[2]

Simone Balloccu, Patrícia Schmidtová, Mateusz Lango, and Ondrej Dusek. 2024. Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed- Source LLMs. InProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Yvette Graham and Matthew Purver (Eds.). Association ...

work page doi:10.18653/v1/2024.eacl-long.5 2024
[3]

Ahmed, and Andreas Kassler

Firas Bayram, Bestoun S. Ahmed, and Andreas Kassler. 2022. From concept drift to model degradation: An overview on performance-aware drift detectors. Knowledge-Based Systems245 (June 2022), 108632. doi:10.1016/j.knosys.2022. 108632

work page doi:10.1016/j.knosys.2022 2022
[4]

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, et al. 2021. On the Opportunities and Risks of Foundation Models. doi:10.48550/arXiv.2108.07258

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2108.07258 2021
[5]

Brendan McMahan, Virginia Smith, and Ameet Talwalkar

Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, and Ameet Talwalkar. 2018. LEAF: A Benchmark for Federated Settings. doi:10.48550/arXiv.1812.01097

work page doi:10.48550/arxiv.1812.01097 2018
[6]

Hongyan Chang and Reza Shokri. 2023. Bias Propagation in Federated Learning.. InThe Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023

2023
[7]

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al . 2024. A Survey on Evaluation of Large Language Models.ACM Trans. Intell. Syst. Technol.15, 3 (2024), Article 39

2024
[8]

Pappas, Florian Tramèr, et al

Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, et al. 2024. JailbreakBench: An Open Robustness Bench- mark for Jailbreaking Large Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D....

work page doi:10.52202/079017-1745 2024
[9]

Huiqiang Chen, Tianqing Zhu, Tao Zhang, Wanlei Zhou, and Philip S. Yu. 2023. Privacy and Fairness in Federated Learning: On the Perspective of Tradeoff.ACM Comput. Surv.56, 2 (2023), Article 39

2023
[10]

Yae Jee Cho, Luyang Liu, Zheng Xu, Aldi Fahrezi, and Gauri Joshi. 2024. Het- erogeneous LoRA for Federated Fine-tuning of On-Device Foundation Mod- els. InProceedings of the 2024 Conference on Empirical Methods in Natural Lan- guage Processing, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.). Association for Computational Linguistics, Miami, Flor...

work page doi:10.18653/v1/2024.emnlp-main.717 2024
[11]

Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Aleš Leonardis, Gregory Slabaugh, and Tinne Tuytelaars. 2022. A Continual Learning Survey: Defying Forgetting in Classification Tasks.IEEE Transactions on Pattern Analysis and Machine Intelligence44, 7 (July 2022), 3366–3385. doi:10.1109/TPAMI. 2021.3057446

work page doi:10.1109/tpami 2022
[12]

Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruk- sachatkun, Kai-Wei Chang, and Rahul Gupta. 2021. BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation. InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Virtual Event, Canada, 862–872

2021
[13]

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. Fairness through awareness. InProceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS ’12). Association for Computing Machinery, New York, NY, USA, 214–226. doi:10.1145/2090236.2090255

work page doi:10.1145/2090236.2090255 2012
[14]

Tao Fan, Hanlin Gu, Xuemei Cao, Chee Seng Chan, Qian Chen, Yiqiang Chen, Yihui Feng, Yang Gu, Jiaxiang Geng, Bing Luo, et al . 2025. Ten Challenging Problems in Federated Foundation Models.IEEE Transactions on Knowledge and Data Engineering37, 7 (July 2025), 4314–4337. doi:10.1109/TKDE.2025.3555328

work page doi:10.1109/tkde.2025.3555328 2025
[15]

Deng, Zachary C

Michael Feffer, Anusha Sinha, Wesley H. Deng, Zachary C. Lipton, and Hoda Heidari. 2024. Red-Teaming for Generative AI: Silver Bullet or Security Theater? Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society7, 1 (Oct. 2024), 421–437. doi:10.1609/aies.v7i1.31647

work page doi:10.1609/aies.v7i1.31647 2024
[16]

Deep Ganguli, Danny Hernandez, Liane Lovitt, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Nova Dassarma, Dawn Drain, Nelson Elhage, et al . 2022. Predictability and Surprise in Large Generative Models. InProceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Seoul, Republic of Korea...

2022
[17]

Scamarcia, Javier Fernandez-Marques, Mohammad Naseri, Chong Ng, Dimitris Stripelis, Zexi Li, Tao Shen, Jiamu Bai, Daoyuan Chen, et al

Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques, Mohammad Naseri, Chong Ng, Dimitris Stripelis, Zexi Li, Tao Shen, Jiamu Bai, Daoyuan Chen, et al
[18]

InAdvances in Neural Information Processing Systems, D

FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models. InAdvances in Neural Information Processing Systems, D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen (Eds.), Vol. 38. Curran Associates, Inc
[19]

Shahriar Golchin and Mihai Surdeanu. 2024. Time Travel in LLMs: Tracing Data Contamination in Large Language Models.. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024

2024
[20]

Weinberger

Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger. 2017. On Calibration of Modern Neural Networks. InProceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 70), Doina Precup and Yee Whye Teh (Eds.). PMLR, 1321–1330

2017
[21]

Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. 2021. The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generaliza- tion.. In2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. 83...

work page doi:10.1109/iccv48922 2021
[22]

Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models.. InThe Tenth International Conference on Learning Representa- tions, ICLR 2022, Virtual Event, April 25-29, 2022

2022
[23]

Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, et al. 2024. Position: TrustLLM: Trustworthiness in Large Language Models. InProceedings of the 41st International Conference on Machine Learning, Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan...

2024
[24]

Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. 2023. Survey of Hallucination in Natural Language Generation.ACM Comput. Surv.55, 12 (2023), Article 248

2023
[25]

Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cor- mode, Rachel Cummings, et al

Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cor- mode, Rachel Cummings, et al. 2021. Advances and Open Problems in Federated Learning.Foundations and Trends®in Machine Learning14, 1–2 (2021), 1–210. doi:10.1561/2200000083

work page doi:10.1561/2200000083 2021
[26]

Fan Lai, Yinwei Dai, Sanjay Singapuram, Jiachen Liu, Xiangfeng Zhu, Harsha Madhyastha, and Mosharaf Chowdhury. 2022. FedScale: Benchmarking Model and System Performance of Federated Learning at Scale. InProceedings of the 39th International Conference on Machine Learning, Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan...

2022
[27]

Changmao Li and Jeffrey Flanigan. 2024. Task Contamination: Language Models May Not Be Few-Shot Anymore.Proceedings of the AAAI Conference on Artificial Intelligence38, 16 (March 2024), 18471–18480. doi:10.1609/aaai.v38i16.29808

work page doi:10.1609/aaai.v38i16.29808 2024
[28]

Jinpeng Li, Hang Yu, Zhenyu Zhang, Xiangfeng Luo, and Shaorong Xie. 2024. Concept Drift Adaptation by Exploiting Drift Type.ACM Trans. Knowl. Discov. Data18, 4 (Feb. 2024), 96:1–96:22. doi:10.1145/3638777

work page doi:10.1145/3638777 2024
[29]

Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. 2020. Federated Optimization in Heterogeneous Networks. In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.), Vol. 2. 429–450

2020
[30]

Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michi- hiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, et al
[31]

Holistic Evaluation of Language Models.Transactions on Machine Learning Research(2023)

2023
[32]

Xinting Liao, Weiming Liu, Pengyang Zhou, Fengyuan Yu, Jiahe Xu, Jun Wang, Wenjie Wang, Chaochao Chen, and Xiaolin Zheng. 2024. FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection. In Advances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Ed...

2024
[33]

Jie Lu, Anjin Liu, Fan Dong, Feng Gu, João Gama, and Guangquan Zhang. 2019. Learning under Concept Drift: A Review.IEEE Transactions on Knowledge and Data Engineering31, 12 (Dec. 2019), 2346–2363. doi:10.1109/TKDE.2018.2876857

work page doi:10.1109/tkde.2018.2876857 2019
[34]

Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp. 2022. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Silent Failures in Federated Personalization of Foundation Models KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea Prompt Order Sensitivity. InProceedings of the 60th Annual Meeting of the ...

work page doi:10.18653/v1/2022.acl-long.556 2022
[35]

Lingjuan Lyu, Han Yu, Xingjun Ma, Chen Chen, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu. 2024. Privacy and Robustness in Federated Learning: Attacks and Defenses.IEEE Transactions on Neural Networks and Learning Systems 35, 7 (July 2024), 8726–8746. doi:10.1109/TNNLS.2022.3216981

work page doi:10.1109/tnnls.2022.3216981 2024
[36]

Wuyuao Mai, Geng Hong, Pei Chen, Xudong Pan, Baojun Liu, Yuan Zhang, Haixin Duan, and Min Yang. 2025. You Can’t Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense. InProceedings of the ACM on Web Conference 2025. Association for Computing Machinery, Sydney NSW, Australia, 872–883

2025
[37]

Mohamad Mansouri, Melek Önen, and Wafa Ben Jaballah. 2022. Learning from Failures: Secure and Fault-Tolerant Aggregation for Federated Learning. InPro- ceedings of the 38th Annual Computer Security Applications Conference. ACM, Austin TX USA, 146–158. doi:10.1145/3564625.3568135

work page doi:10.1145/3564625.3568135 2022
[38]

Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, et al . 2024. HarmBench: a standardized evaluation framework for automated red teaming and robust refusal. InProceedings of the 41st International Conference on Machine Learning (ICML’24, Vol. 235). JMLR.org, Vienna, Austria, 35181–35224

2024
[39]

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-Efficient Learning of Deep Networks from Decentralized Data. InProceedings of the 20th International Conference on Artificial Intelligence and Statistics, Aarti Singh and Jerry Zhu (Eds.), Vol. 54. PMLR, Proceedings of Machine Learning Research, 1273–1282

2017
[40]

A unifying view on dataset shift in classification

Jose G. Moreno-Torres, Troy Raeder, Rocío Alaiz-Rodríguez, Nitesh V. Chawla, and Francisco Herrera. 2012. A unifying view on dataset shift in classification. Pattern Recognition45, 1 (Jan. 2012), 521–530. doi:10.1016/j.patcog.2011.06.019

work page doi:10.1016/j.patcog.2011.06.019 2012
[41]

Jean Ogier du Terrail, Samy-Safwan Ayed, Edwige Cyffers, Felix Grimberg, Chaoyang He, Regis Loeb, Paul Mangold, Tanguy Marchand, Othmane Mar- foq, Erum Mushtaq, et al. 2022. FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings. InAdvances in Neural In- formation Processing Systems, S. Koyejo, S. Mohamed, A. Ag...

2022
[42]

Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, and Kyungwoo Song. 2024. Towards Calibrated Robust Fine-Tuning of Vision-Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curra...

2024
[43]

YongKyung Oh and Alex Bui. 2025. Multi-View Contrastive Learning for Ro- bust Domain Adaptation in Medical Time Series Analysis. InProceedings of the sixth Conference on Health, Inference, and Learning, Xuhai Orson Xu, Edward Choi, Pankhuri Singhal, Walter Gerych, Shengpu Tang, Monica Agrawal, Adarsh Subbaswamy, Elena Sizikova, Jessilyn Dunn, Roxana Danes...

2025
[44]

Mahdi Pakdaman Naeini, Gregory Cooper, and Milos Hauskrecht. 2015. Obtaining Well Calibrated Probabilities Using Bayesian Binning.Proceedings of the AAAI Conference on Artificial Intelligence29, 1 (Feb. 2015). doi:10.1609/aaai.v29i1.9602

work page doi:10.1609/aaai.v29i1.9602 2015
[45]

Jiaming Pei, Wenxuan Liu, Jinhai Li, Lukun Wang, and Chao Liu. 2024. A Review of Federated Learning Methods in Heterogeneous Scenarios.IEEE Transactions on Consumer Electronics70, 3 (Aug. 2024), 5983–5999. doi:10.1109/TCE.2024.3385440

work page doi:10.1109/tce.2024.3385440 2024
[46]

Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022. Red Teaming Language Models with Language Models. InProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang (Eds.). Association for Computa...

work page doi:10.18653/v1/2022.emnlp-main.225 2022
[47]

Krishna Pillutla, Kshitiz Malik, Abdel-Rahman Mohamed, Mike Rabbat, Maziar Sanjabi, and Lin Xiao. 2022. Federated Learning with Partial Model Personal- ization. InProceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Ste- fanie Jegelka, Le Song, Csaba Szepesvari, Gang...

2022
[48]

Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, and Pe- ter Henderson. 2024. Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024

2024
[49]

A Survey on Foundation Models for Personalized Federated Intelligence

Yu Qiao, Huy Q. Le, Avi Deb Raha, Phuong-Nam Tran, Apurba Adhikary, Mengchun Zhang, Loc X. Nguyen, Eui-Nam Huh, Dusit Niyato, and Choong Seon Hong. 2025. Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence. doi:10.48550/ arXiv.2505.06907

work page internal anchor Pith review Pith/arXiv arXiv 2025
[50]

Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Bo Zhao, Liping Yi, Alysa Ziying Tan, Yulan Gao, Anran Li, Xiaoxiao Li, et al. 2025. Advances and Open Challenges in Federated Foundation Models.IEEE Communications Surveys & Tutorials(2025), 1–1. doi:10.1109/comst.2025.3552524

work page doi:10.1109/comst.2025.3552524 2025
[51]

Hashimoto, and Percy Liang

Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, and Percy Liang. 2020. Distributionally Robust Neural Networks.. In8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020

2020
[52]

Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, et al

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, et al. 2023. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.Trans. Mach. Learn. Res.2023 (2023)

2023
[53]

Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, et al. 2023. Decod- ingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. In Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Glober- son, K. Saenko, M. Hardt, and S. Levine (Eds.), V...

2023
[54]

Papailiopoulos, and Yasaman Khazaeni

Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris S. Papailiopoulos, and Yasaman Khazaeni. 2020. Federated Learning with Matched Averaging.. In8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020

2020
[55]

Albert Webson and Ellie Pavlick. 2022. Do Prompt-Based Models Really Un- derstand the Meaning of Their Prompts?(Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Hu- man Language Technologies). Association for Computational Linguistics, Seattle, United States, 2300–2344

2022
[56]

Alexander Wei, Nika Haghtalab, and Jacob Steinhardt. 2023. Jailbroken: How Does LLM Safety Training Fail?. InAdvances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 80079–80110

2023
[57]

Sierra Wyllie, Ilia Shumailov, and Nicolas Papernot. 2024. Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Rio de Janeiro, Brazil, 2113–2147

2024
[58]

Zhao XU, Fan LIU, and Hao LIU. 2024. Bag of Tricks: Benchmarking of Jail- break Attacks on LLMs. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc., 32219–32250

2024
[59]

Guanhui Yang, Xiaoting Chen, Tengsen Zhang, Shuo Wang, and Yun Yang. 2023. An Impact Study of Concept Drift in Federated Learning. In2023 IEEE Interna- tional Conference on Data Mining (ICDM). 1457–1462. doi:10.1109/ICDM58522. 2023.00191

work page doi:10.1109/icdm58522 2023
[60]

Lei Yang, Jiaming Huang, Wanyu Lin, and Jiannong Cao. 2023. Personalized Federated Learning on Non-IID Data via Group-based Meta-learning.ACM Trans. Knowl. Discov. Data17, 4 (March 2023), Article 49. doi:10.1145/3558005

work page doi:10.1145/3558005 2023
[61]

Yiyuan Yang, Guodong Long, Tao Shen, Jing Jiang, and Michael Blumenstein
[62]

InAdvances in Neural Information Processing Systems, A

Dual-Personalizing Adapter for Federated Foundation Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc., 39409–39433
[63]

Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, and Yue Zhang
[64]

doi:10.1016/j.hcc.2024.100211

A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly.High-Confidence Computing4, 2 (June 2024), 100211. doi:10.1016/j.hcc.2024.100211

work page doi:10.1016/j.hcc.2024.100211 2024
[65]

Yuen, and Dacheng Tao

Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, and Dacheng Tao. 2023. Hetero- geneous Federated Learning: State-of-the-art and Research Challenges.ACM Comput. Surv.56, 3 (2023), Article 79

2023
[66]

Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, and Siheng Chen. 2024. FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc.,...

work page doi:10.52202/079017-3528 2024
[67]

Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, and Siheng Chen. 2024. OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’24). Association for Computing Machinery, New York, NY, U...

work page doi:10.1145/3637528.3671582 2024
[68]

Baosheng Zhang, Yuchen Guo, Yipeng Li, Yuwei He, Haoqian Wang, and Qionghai Dai. 2022. Memory Recall: A Simple Neural Network Training Framework Against Catastrophic Forgetting.IEEE Transactions on Neural Networks and Learning Systems33, 5 (May 2022), 2010–2022. doi:10.1109/TNNLS.2021.3099700

work page doi:10.1109/tnnls.2021.3099700 2022
[69]

Jianqing Zhang, Yang Liu, Yang Hua, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, and Jian Cao. 2025. PFLlib: A Beginner-Friendly and Comprehensive Person- alized Federated Learning Library and Benchmark.Journal of Machine Learning Research26, 50 (2025), 1–10

2025
[70]

Junyuan Zhang, Shuang Zeng, Miao Zhang, Runxi Wang, Feifei Wang, Yuyin Zhou, Paul Pu Liang, and Liangqiong Qu. 2024. FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning.. InIEEE/CVF Conference KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea YongKyung Oh and Alex Bui on Computer Vision and Pattern Recognition, CVPR 202...

work page doi:10.1109/cvpr52733.2024.01150 2024
[71]

Wenhao Zhang, Zimu Zhou, Yansheng Wang, and Yongxin Tong. 2023. DM-PFL: Hitchhiking Generic Federated Learning for Efficient Shift-Robust Personaliza- tion. InProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, Long Beach, CA, USA, 3396–3408

2023
[72]

Yifei Zhang, Dun Zeng, Jinglong Luo, Zenglin Xu, and Irwin King. 2023. A Survey of Trustworthy Federated Learning with Perspectives on Security, Robustness and Privacy. InCompanion Proceedings of the ACM Web Conference 2023. Association for Computing Machinery, Austin, TX, USA, 1167–1176

2023
[73]

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Zican Dong, Yupeng Hou, Beichen Zhang, Yingqian Min, Junjie Zhang, Peiyu Liu, et al. 2026. A Survey of Large Language Models.Frontiers of Computer Science20, 12 (May 2026), 2012627. doi:10.1007/s11704-026-60308-3

work page doi:10.1007/s11704-026-60308-3 2026
[74]

Yuang Zhao, Chuhan Wu, Qinglin Jia, Hong Zhu, Jia Yan, Libin Zong, Linxuan Zhang, Zhenhua Dong, and Muyu Zhang. 2024. Confidence-Aware Multi-Field Model Calibration. InProceedings of the 33rd ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Boise, ID, USA, 5111–5118. A Benchmark Paper Overviews Thi...

2024

[1] [1]

Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, et al. 2022. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. doi:10.48550/arXiv.2204.05862

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2204.05862 2022

[2] [2]

Simone Balloccu, Patrícia Schmidtová, Mateusz Lango, and Ondrej Dusek. 2024. Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed- Source LLMs. InProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Yvette Graham and Matthew Purver (Eds.). Association ...

work page doi:10.18653/v1/2024.eacl-long.5 2024

[3] [3]

Ahmed, and Andreas Kassler

Firas Bayram, Bestoun S. Ahmed, and Andreas Kassler. 2022. From concept drift to model degradation: An overview on performance-aware drift detectors. Knowledge-Based Systems245 (June 2022), 108632. doi:10.1016/j.knosys.2022. 108632

work page doi:10.1016/j.knosys.2022 2022

[4] [4]

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, et al. 2021. On the Opportunities and Risks of Foundation Models. doi:10.48550/arXiv.2108.07258

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2108.07258 2021

[5] [5]

Brendan McMahan, Virginia Smith, and Ameet Talwalkar

Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, and Ameet Talwalkar. 2018. LEAF: A Benchmark for Federated Settings. doi:10.48550/arXiv.1812.01097

work page doi:10.48550/arxiv.1812.01097 2018

[6] [6]

Hongyan Chang and Reza Shokri. 2023. Bias Propagation in Federated Learning.. InThe Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023

2023

[7] [7]

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al . 2024. A Survey on Evaluation of Large Language Models.ACM Trans. Intell. Syst. Technol.15, 3 (2024), Article 39

2024

[8] [8]

Pappas, Florian Tramèr, et al

Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, et al. 2024. JailbreakBench: An Open Robustness Bench- mark for Jailbreaking Large Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D....

work page doi:10.52202/079017-1745 2024

[9] [9]

Huiqiang Chen, Tianqing Zhu, Tao Zhang, Wanlei Zhou, and Philip S. Yu. 2023. Privacy and Fairness in Federated Learning: On the Perspective of Tradeoff.ACM Comput. Surv.56, 2 (2023), Article 39

2023

[10] [10]

Yae Jee Cho, Luyang Liu, Zheng Xu, Aldi Fahrezi, and Gauri Joshi. 2024. Het- erogeneous LoRA for Federated Fine-tuning of On-Device Foundation Mod- els. InProceedings of the 2024 Conference on Empirical Methods in Natural Lan- guage Processing, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.). Association for Computational Linguistics, Miami, Flor...

work page doi:10.18653/v1/2024.emnlp-main.717 2024

[11] [11]

Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Aleš Leonardis, Gregory Slabaugh, and Tinne Tuytelaars. 2022. A Continual Learning Survey: Defying Forgetting in Classification Tasks.IEEE Transactions on Pattern Analysis and Machine Intelligence44, 7 (July 2022), 3366–3385. doi:10.1109/TPAMI. 2021.3057446

work page doi:10.1109/tpami 2022

[12] [12]

Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruk- sachatkun, Kai-Wei Chang, and Rahul Gupta. 2021. BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation. InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Virtual Event, Canada, 862–872

2021

[13] [13]

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. Fairness through awareness. InProceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS ’12). Association for Computing Machinery, New York, NY, USA, 214–226. doi:10.1145/2090236.2090255

work page doi:10.1145/2090236.2090255 2012

[14] [14]

Tao Fan, Hanlin Gu, Xuemei Cao, Chee Seng Chan, Qian Chen, Yiqiang Chen, Yihui Feng, Yang Gu, Jiaxiang Geng, Bing Luo, et al . 2025. Ten Challenging Problems in Federated Foundation Models.IEEE Transactions on Knowledge and Data Engineering37, 7 (July 2025), 4314–4337. doi:10.1109/TKDE.2025.3555328

work page doi:10.1109/tkde.2025.3555328 2025

[15] [15]

Deng, Zachary C

Michael Feffer, Anusha Sinha, Wesley H. Deng, Zachary C. Lipton, and Hoda Heidari. 2024. Red-Teaming for Generative AI: Silver Bullet or Security Theater? Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society7, 1 (Oct. 2024), 421–437. doi:10.1609/aies.v7i1.31647

work page doi:10.1609/aies.v7i1.31647 2024

[16] [16]

Deep Ganguli, Danny Hernandez, Liane Lovitt, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Nova Dassarma, Dawn Drain, Nelson Elhage, et al . 2022. Predictability and Surprise in Large Generative Models. InProceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Seoul, Republic of Korea...

2022

[17] [17]

Scamarcia, Javier Fernandez-Marques, Mohammad Naseri, Chong Ng, Dimitris Stripelis, Zexi Li, Tao Shen, Jiamu Bai, Daoyuan Chen, et al

Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques, Mohammad Naseri, Chong Ng, Dimitris Stripelis, Zexi Li, Tao Shen, Jiamu Bai, Daoyuan Chen, et al

[18] [18]

InAdvances in Neural Information Processing Systems, D

FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models. InAdvances in Neural Information Processing Systems, D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen (Eds.), Vol. 38. Curran Associates, Inc

[19] [19]

Shahriar Golchin and Mihai Surdeanu. 2024. Time Travel in LLMs: Tracing Data Contamination in Large Language Models.. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024

2024

[20] [20]

Weinberger

Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger. 2017. On Calibration of Modern Neural Networks. InProceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 70), Doina Precup and Yee Whye Teh (Eds.). PMLR, 1321–1330

2017

[21] [21]

Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. 2021. The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generaliza- tion.. In2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. 83...

work page doi:10.1109/iccv48922 2021

[22] [22]

Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models.. InThe Tenth International Conference on Learning Representa- tions, ICLR 2022, Virtual Event, April 25-29, 2022

2022

[23] [23]

Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, et al. 2024. Position: TrustLLM: Trustworthiness in Large Language Models. InProceedings of the 41st International Conference on Machine Learning, Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan...

2024

[24] [24]

Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. 2023. Survey of Hallucination in Natural Language Generation.ACM Comput. Surv.55, 12 (2023), Article 248

2023

[25] [25]

Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cor- mode, Rachel Cummings, et al

Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cor- mode, Rachel Cummings, et al. 2021. Advances and Open Problems in Federated Learning.Foundations and Trends®in Machine Learning14, 1–2 (2021), 1–210. doi:10.1561/2200000083

work page doi:10.1561/2200000083 2021

[26] [26]

Fan Lai, Yinwei Dai, Sanjay Singapuram, Jiachen Liu, Xiangfeng Zhu, Harsha Madhyastha, and Mosharaf Chowdhury. 2022. FedScale: Benchmarking Model and System Performance of Federated Learning at Scale. InProceedings of the 39th International Conference on Machine Learning, Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan...

2022

[27] [27]

Changmao Li and Jeffrey Flanigan. 2024. Task Contamination: Language Models May Not Be Few-Shot Anymore.Proceedings of the AAAI Conference on Artificial Intelligence38, 16 (March 2024), 18471–18480. doi:10.1609/aaai.v38i16.29808

work page doi:10.1609/aaai.v38i16.29808 2024

[28] [28]

Jinpeng Li, Hang Yu, Zhenyu Zhang, Xiangfeng Luo, and Shaorong Xie. 2024. Concept Drift Adaptation by Exploiting Drift Type.ACM Trans. Knowl. Discov. Data18, 4 (Feb. 2024), 96:1–96:22. doi:10.1145/3638777

work page doi:10.1145/3638777 2024

[29] [29]

Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. 2020. Federated Optimization in Heterogeneous Networks. In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.), Vol. 2. 429–450

2020

[30] [30]

Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michi- hiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, et al

[31] [31]

Holistic Evaluation of Language Models.Transactions on Machine Learning Research(2023)

2023

[32] [32]

Xinting Liao, Weiming Liu, Pengyang Zhou, Fengyuan Yu, Jiahe Xu, Jun Wang, Wenjie Wang, Chaochao Chen, and Xiaolin Zheng. 2024. FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection. In Advances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Ed...

2024

[33] [33]

Jie Lu, Anjin Liu, Fan Dong, Feng Gu, João Gama, and Guangquan Zhang. 2019. Learning under Concept Drift: A Review.IEEE Transactions on Knowledge and Data Engineering31, 12 (Dec. 2019), 2346–2363. doi:10.1109/TKDE.2018.2876857

work page doi:10.1109/tkde.2018.2876857 2019

[34] [34]

Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp. 2022. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Silent Failures in Federated Personalization of Foundation Models KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea Prompt Order Sensitivity. InProceedings of the 60th Annual Meeting of the ...

work page doi:10.18653/v1/2022.acl-long.556 2022

[35] [35]

Lingjuan Lyu, Han Yu, Xingjun Ma, Chen Chen, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu. 2024. Privacy and Robustness in Federated Learning: Attacks and Defenses.IEEE Transactions on Neural Networks and Learning Systems 35, 7 (July 2024), 8726–8746. doi:10.1109/TNNLS.2022.3216981

work page doi:10.1109/tnnls.2022.3216981 2024

[36] [36]

Wuyuao Mai, Geng Hong, Pei Chen, Xudong Pan, Baojun Liu, Yuan Zhang, Haixin Duan, and Min Yang. 2025. You Can’t Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense. InProceedings of the ACM on Web Conference 2025. Association for Computing Machinery, Sydney NSW, Australia, 872–883

2025

[37] [37]

Mohamad Mansouri, Melek Önen, and Wafa Ben Jaballah. 2022. Learning from Failures: Secure and Fault-Tolerant Aggregation for Federated Learning. InPro- ceedings of the 38th Annual Computer Security Applications Conference. ACM, Austin TX USA, 146–158. doi:10.1145/3564625.3568135

work page doi:10.1145/3564625.3568135 2022

[38] [38]

Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, et al . 2024. HarmBench: a standardized evaluation framework for automated red teaming and robust refusal. InProceedings of the 41st International Conference on Machine Learning (ICML’24, Vol. 235). JMLR.org, Vienna, Austria, 35181–35224

2024

[39] [39]

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-Efficient Learning of Deep Networks from Decentralized Data. InProceedings of the 20th International Conference on Artificial Intelligence and Statistics, Aarti Singh and Jerry Zhu (Eds.), Vol. 54. PMLR, Proceedings of Machine Learning Research, 1273–1282

2017

[40] [40]

A unifying view on dataset shift in classification

Jose G. Moreno-Torres, Troy Raeder, Rocío Alaiz-Rodríguez, Nitesh V. Chawla, and Francisco Herrera. 2012. A unifying view on dataset shift in classification. Pattern Recognition45, 1 (Jan. 2012), 521–530. doi:10.1016/j.patcog.2011.06.019

work page doi:10.1016/j.patcog.2011.06.019 2012

[41] [41]

Jean Ogier du Terrail, Samy-Safwan Ayed, Edwige Cyffers, Felix Grimberg, Chaoyang He, Regis Loeb, Paul Mangold, Tanguy Marchand, Othmane Mar- foq, Erum Mushtaq, et al. 2022. FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings. InAdvances in Neural In- formation Processing Systems, S. Koyejo, S. Mohamed, A. Ag...

2022

[42] [42]

Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, and Kyungwoo Song. 2024. Towards Calibrated Robust Fine-Tuning of Vision-Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curra...

2024

[43] [43]

YongKyung Oh and Alex Bui. 2025. Multi-View Contrastive Learning for Ro- bust Domain Adaptation in Medical Time Series Analysis. InProceedings of the sixth Conference on Health, Inference, and Learning, Xuhai Orson Xu, Edward Choi, Pankhuri Singhal, Walter Gerych, Shengpu Tang, Monica Agrawal, Adarsh Subbaswamy, Elena Sizikova, Jessilyn Dunn, Roxana Danes...

2025

[44] [44]

Mahdi Pakdaman Naeini, Gregory Cooper, and Milos Hauskrecht. 2015. Obtaining Well Calibrated Probabilities Using Bayesian Binning.Proceedings of the AAAI Conference on Artificial Intelligence29, 1 (Feb. 2015). doi:10.1609/aaai.v29i1.9602

work page doi:10.1609/aaai.v29i1.9602 2015

[45] [45]

Jiaming Pei, Wenxuan Liu, Jinhai Li, Lukun Wang, and Chao Liu. 2024. A Review of Federated Learning Methods in Heterogeneous Scenarios.IEEE Transactions on Consumer Electronics70, 3 (Aug. 2024), 5983–5999. doi:10.1109/TCE.2024.3385440

work page doi:10.1109/tce.2024.3385440 2024

[46] [46]

Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022. Red Teaming Language Models with Language Models. InProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang (Eds.). Association for Computa...

work page doi:10.18653/v1/2022.emnlp-main.225 2022

[47] [47]

Krishna Pillutla, Kshitiz Malik, Abdel-Rahman Mohamed, Mike Rabbat, Maziar Sanjabi, and Lin Xiao. 2022. Federated Learning with Partial Model Personal- ization. InProceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Ste- fanie Jegelka, Le Song, Csaba Szepesvari, Gang...

2022

[48] [48]

Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, and Pe- ter Henderson. 2024. Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024

2024

[49] [49]

A Survey on Foundation Models for Personalized Federated Intelligence

Yu Qiao, Huy Q. Le, Avi Deb Raha, Phuong-Nam Tran, Apurba Adhikary, Mengchun Zhang, Loc X. Nguyen, Eui-Nam Huh, Dusit Niyato, and Choong Seon Hong. 2025. Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence. doi:10.48550/ arXiv.2505.06907

work page internal anchor Pith review Pith/arXiv arXiv 2025

[50] [50]

Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Bo Zhao, Liping Yi, Alysa Ziying Tan, Yulan Gao, Anran Li, Xiaoxiao Li, et al. 2025. Advances and Open Challenges in Federated Foundation Models.IEEE Communications Surveys & Tutorials(2025), 1–1. doi:10.1109/comst.2025.3552524

work page doi:10.1109/comst.2025.3552524 2025

[51] [51]

Hashimoto, and Percy Liang

Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, and Percy Liang. 2020. Distributionally Robust Neural Networks.. In8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020

2020

[52] [52]

Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, et al

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, et al. 2023. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.Trans. Mach. Learn. Res.2023 (2023)

2023

[53] [53]

Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, et al. 2023. Decod- ingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. In Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Glober- son, K. Saenko, M. Hardt, and S. Levine (Eds.), V...

2023

[54] [54]

Papailiopoulos, and Yasaman Khazaeni

Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris S. Papailiopoulos, and Yasaman Khazaeni. 2020. Federated Learning with Matched Averaging.. In8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020

2020

[55] [55]

Albert Webson and Ellie Pavlick. 2022. Do Prompt-Based Models Really Un- derstand the Meaning of Their Prompts?(Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Hu- man Language Technologies). Association for Computational Linguistics, Seattle, United States, 2300–2344

2022

[56] [56]

Alexander Wei, Nika Haghtalab, and Jacob Steinhardt. 2023. Jailbroken: How Does LLM Safety Training Fail?. InAdvances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 80079–80110

2023

[57] [57]

Sierra Wyllie, Ilia Shumailov, and Nicolas Papernot. 2024. Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, Rio de Janeiro, Brazil, 2113–2147

2024

[58] [58]

Zhao XU, Fan LIU, and Hao LIU. 2024. Bag of Tricks: Benchmarking of Jail- break Attacks on LLMs. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc., 32219–32250

2024

[59] [59]

Guanhui Yang, Xiaoting Chen, Tengsen Zhang, Shuo Wang, and Yun Yang. 2023. An Impact Study of Concept Drift in Federated Learning. In2023 IEEE Interna- tional Conference on Data Mining (ICDM). 1457–1462. doi:10.1109/ICDM58522. 2023.00191

work page doi:10.1109/icdm58522 2023

[60] [60]

Lei Yang, Jiaming Huang, Wanyu Lin, and Jiannong Cao. 2023. Personalized Federated Learning on Non-IID Data via Group-based Meta-learning.ACM Trans. Knowl. Discov. Data17, 4 (March 2023), Article 49. doi:10.1145/3558005

work page doi:10.1145/3558005 2023

[61] [61]

Yiyuan Yang, Guodong Long, Tao Shen, Jing Jiang, and Michael Blumenstein

[62] [62]

InAdvances in Neural Information Processing Systems, A

Dual-Personalizing Adapter for Federated Foundation Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc., 39409–39433

[63] [63]

Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, and Yue Zhang

[64] [64]

doi:10.1016/j.hcc.2024.100211

A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly.High-Confidence Computing4, 2 (June 2024), 100211. doi:10.1016/j.hcc.2024.100211

work page doi:10.1016/j.hcc.2024.100211 2024

[65] [65]

Yuen, and Dacheng Tao

Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, and Dacheng Tao. 2023. Hetero- geneous Federated Learning: State-of-the-art and Research Challenges.ACM Comput. Surv.56, 3 (2023), Article 79

2023

[66] [66]

Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, and Siheng Chen. 2024. FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models. InAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Eds.), Vol. 37. Curran Associates, Inc.,...

work page doi:10.52202/079017-3528 2024

[67] [67]

Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, and Siheng Chen. 2024. OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’24). Association for Computing Machinery, New York, NY, U...

work page doi:10.1145/3637528.3671582 2024

[68] [68]

Baosheng Zhang, Yuchen Guo, Yipeng Li, Yuwei He, Haoqian Wang, and Qionghai Dai. 2022. Memory Recall: A Simple Neural Network Training Framework Against Catastrophic Forgetting.IEEE Transactions on Neural Networks and Learning Systems33, 5 (May 2022), 2010–2022. doi:10.1109/TNNLS.2021.3099700

work page doi:10.1109/tnnls.2021.3099700 2022

[69] [69]

Jianqing Zhang, Yang Liu, Yang Hua, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, and Jian Cao. 2025. PFLlib: A Beginner-Friendly and Comprehensive Person- alized Federated Learning Library and Benchmark.Journal of Machine Learning Research26, 50 (2025), 1–10

2025

[70] [70]

Junyuan Zhang, Shuang Zeng, Miao Zhang, Runxi Wang, Feifei Wang, Yuyin Zhou, Paul Pu Liang, and Liangqiong Qu. 2024. FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning.. InIEEE/CVF Conference KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea YongKyung Oh and Alex Bui on Computer Vision and Pattern Recognition, CVPR 202...

work page doi:10.1109/cvpr52733.2024.01150 2024

[71] [71]

Wenhao Zhang, Zimu Zhou, Yansheng Wang, and Yongxin Tong. 2023. DM-PFL: Hitchhiking Generic Federated Learning for Efficient Shift-Robust Personaliza- tion. InProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, Long Beach, CA, USA, 3396–3408

2023

[72] [72]

Yifei Zhang, Dun Zeng, Jinglong Luo, Zenglin Xu, and Irwin King. 2023. A Survey of Trustworthy Federated Learning with Perspectives on Security, Robustness and Privacy. InCompanion Proceedings of the ACM Web Conference 2023. Association for Computing Machinery, Austin, TX, USA, 1167–1176

2023

[73] [73]

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Zican Dong, Yupeng Hou, Beichen Zhang, Yingqian Min, Junjie Zhang, Peiyu Liu, et al. 2026. A Survey of Large Language Models.Frontiers of Computer Science20, 12 (May 2026), 2012627. doi:10.1007/s11704-026-60308-3

work page doi:10.1007/s11704-026-60308-3 2026

[74] [74]

Yuang Zhao, Chuhan Wu, Qinglin Jia, Hong Zhu, Jia Yan, Libin Zong, Linxuan Zhang, Zhenhua Dong, and Muyu Zhang. 2024. Confidence-Aware Multi-Field Model Calibration. InProceedings of the 33rd ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Boise, ID, USA, 5111–5118. A Benchmark Paper Overviews Thi...

2024