On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Andy Zou; Anka Reuel; Bo Li; Bryan Hooi Kuen-Yew; Caiming Xiong; Chaowei Xiao; Chujie Gao; Dawn Song; Dongping Chen; Elias Stengel-Eskin

arxiv: 2502.14296 · v5 · pith:RGERRQACnew · submitted 2025-02-20 · 💻 cs.CY

Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Yujun Zhou, Yanbo Wang, Jiayi Ye, Jiawen Shi, Qihui Zhang, Yuan Li, Han Bao, Zhaoyi Liu, Tianrui Guan, Dongping Chen, Ruoxi Chen, Kehan Guo, Andy Zou, Bryan Hooi Kuen-Yew, Caiming Xiong, Elias Stengel-Eskin, Hongyang Zhang, Hongzhi Yin, Huan Zhang, Huaxiu Yao, Jaehong Yoon, Jieyu Zhang, Kai Shu, Kaijie Zhu, Ranjay Krishna, Swabha Swayamdipta, Taiwei Shi, Weijia Shi, Xiang Li, Yiwei Li, Yuexing Hao, Zhihao Jia, Zhize Li, Xiuying Chen, Zhengzhong Tu, Xiyang Hu, Tianyi Zhou, Jieyu Zhao, Lichao Sun, Furong Huang, Or Cohen Sasson, Prasanna Sattigeri, Anka Reuel, Max Lamparth, Yue Zhao, Nouha Dziri, Yu Su, Huan Sun, Heng Ji, Chaowei Xiao, Mohit Bansal, Nitesh V. Chawla, Jian Pei, Jianfeng Gao, Michael Backes, Philip S. Yu, Neil Zhenqiang Gong, Pin-Yu Chen, Bo Li, Dawn Song, Xiangliang Zhang

classification 💻 cs.CY

keywords trustworthiness, challenges, genfms, models, trustgen, across, applications, critical

Generative Foundation Models (GenFMs) have emerged as transformative tools. However, their widespread adoption raises critical concerns regarding trustworthiness across dimensions. This paper presents a comprehensive framework to address these challenges through three key contributions. First, we systematically review global AI governance laws and policies from governments and regulatory bodies, as well as industry practices and standards. Based on this analysis, we propose a set of guiding principles for GenFMs, developed through extensive multidisciplinary collaboration that integrates technical, ethical, legal, and societal perspectives. Second, we introduce TrustGen, the first dynamic benchmarking platform designed to evaluate trustworthiness across multiple dimensions and model types, including text-to-image, large language, and vision-language models. TrustGen leverages modular components--metadata curation, test case generation, and contextual variation--to enable adaptive and iterative assessments, overcoming the limitations of static evaluation methods. Using TrustGen, we reveal significant progress in trustworthiness while identifying persistent challenges. Finally, we provide an in-depth discussion of the challenges and future directions for trustworthy GenFMs. This discussion reveals the complex, evolving nature of trustworthiness, highlights the nuanced trade-offs between utility and trustworthiness, considers various downstream applications, identifies persistent challenges, and provides a strategic roadmap for future research. This work establishes a holistic framework for advancing trustworthiness in GenAI, paving the way for safer and more responsible integration of GenFMs into critical applications. To facilitate advancement in the community, we release the toolkit for dynamic evaluation.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Uncertainty Quantification for Distribution-to-Distribution Flow Matching in Scientific Imaging
cs.LG 2026-03 unverdicted novelty 6.0

Bayesian Stochastic Flow Matching augments flow models with stochastic diffusion for better generalization and uses Monte Carlo Dropout with antithetic sampling to disentangle uncertainties and detect out-of-distribut...

Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs
cs.LG 2026-04 unverdicted novelty 5.0

Guardian-as-an-Advisor prepends risk labels and explanations from a guardian model to queries, improving LLM safety compliance and reducing over-refusal while adding minimal compute overhead.

Emergent Social Intelligence Risks in Generative Multi-Agent Systems
cs.MA 2026-03 unverdicted novelty 5.0

Generative multi-agent systems exhibit emergent collusion and conformity behaviors that cannot be prevented by existing agent-level safeguards.

Towards provable probabilistic safety for scalable embodied AI systems
eess.SY 2025-06 unverdicted novelty 4.0

The paper proposes a paradigm of provable probabilistic safety to enable scalable, safe deployment of embodied AI in critical applications.