pith. machine review for the scientific record.

arxiv: 2604.03976 · v2 · submitted 2026-04-05 · 💻 cs.AI · cs.CE

Recognition: no theorem link

Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

Authors on Pith no claims yet

Pith reviewed 2026-05-13 17:13 UTC · model grok-4.3

classification 💻 cs.AI cs.CE
keywords trustworthy AI · AI agents · risk management · financial underwriting · Agentic Risk Standard · compensation · autonomous systems · AI safety

The pith

The Agentic Risk Standard turns implicit expectations of AI agent behavior into explicit, contractually enforceable compensation for failures and misalignments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Prior work on trustworthy AI centers on internal model properties such as bias mitigation and robustness. As agents become autonomous and handle payments or assets, trust instead requires reliable end-to-end outcomes that stochastic behavior makes impossible to guarantee through technical safeguards alone. The paper proposes the Agentic Risk Standard as a financial underwriting framework that embeds risk assessment, underwriting, and predefined compensation directly into each transaction. Users thereby receive contractually binding payouts for execution failures, intent misalignment, or unintended harms. A simulation study examines the resulting social benefits of this shift from model-level reliability to product-level guarantees.

Core claim

The paper establishes that the Agentic Risk Standard integrates risk assessment, underwriting, and compensation into a single transaction framework for AI-mediated transactions. Under ARS, users receive predefined and contractually enforceable compensation in cases of execution failure, misalignment, or unintended outcomes. This shifts trust from an implicit expectation about model behavior to an explicit, measurable, and enforceable product guarantee.

What carries the argument

The Agentic Risk Standard (ARS), a payment settlement standard that combines risk assessment, underwriting, and compensation into AI agent transactions to create enforceable user guarantees.
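The escrow-and-compensation flow this standard describes (lock the service fee, evaluate the outcome, then release the fee or pay a predefined compensation) can be sketched in a few lines. This is an illustrative reading of the mechanism, not the reference implementation from the paper's repository; the class and outcome names are invented.

```python
# Minimal sketch of an ARS-style settlement rule: the fee sits in escrow and
# is released to the agent only on success; any failure mode triggers a refund
# plus the contractually predefined compensation. Names are hypothetical.
from dataclasses import dataclass
from enum import Enum, auto

class Outcome(Enum):
    SUCCESS = auto()
    EXECUTION_FAILURE = auto()
    INTENT_MISALIGNMENT = auto()

@dataclass
class EscrowVault:
    fee: float           # service fee locked by the requestor
    compensation: float  # predefined payout owed to the user on failure

    def settle(self, outcome: Outcome) -> dict:
        """Release the fee to the agent on success; otherwise refund the fee
        and pay the contractual compensation to the user."""
        if outcome is Outcome.SUCCESS:
            return {"agent": self.fee, "user": 0.0}
        return {"agent": 0.0, "user": self.fee + self.compensation}

vault = EscrowVault(fee=10.0, compensation=25.0)
print(vault.settle(Outcome.SUCCESS))            # {'agent': 10.0, 'user': 0.0}
print(vault.settle(Outcome.EXECUTION_FAILURE))  # {'agent': 0.0, 'user': 35.0}
```

The point of the sketch is the shift the paper claims: the guarantee lives in the settlement rule, not in the agent's behavior.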

Load-bearing premise

Agent risks are fundamentally product-level and cannot be eliminated by technical safeguards alone, so a financial compensation layer is required to create enforceable trust.

What would settle it

A real-world deployment in which technical safeguards alone eliminate all material user harms from agent stochasticity without any compensation mechanism.

Figures

Figures reproduced from arXiv: 2604.03976 by Bryan Lim, Chandler Fang, Chi Wang, Ian Kaufman, Jiaxin Pei, Tianyi Peng, Wenyue Hua.

Figure 1. ARS is a transaction-layer assurance standard for agentic services that converts stochastic, outcome-level risk into explicit settlement rules. Without ARS, users must prepay agents (and in fund-moving tasks, also hand over execution capital), exposing them to non-delivery, misexecution, and downstream harms. With ARS, service fees are locked in an escrow vault and released only upon successful evaluation; … view at source ↗

Figure 2. The requestor first sends a task specification to the business agent. Both parties may then enter a … view at source ↗

Figure 3. Fee track in the transaction phase: the requestor locks the service fee in an escrow vault before … view at source ↗

Figure 4. Principal track in the transaction phase: when execution involves user funds (principal), … view at source ↗

Figure 5. AP2+ARS: AP2 provides authorization evidence and bounded delegation, and ARS adds settlement semantics over the authorized transaction. view at source ↗

Figure 6. VI+ARS: VI governs privacy-preserving authorization through layered credentials and selective disclosure; ARS governs settlement and compensation over the authorized transaction. view at source ↗

Figure 7. Loading factor sweep: adoption rate, loss reduction rate, failure reduction rate, and underwriter … view at source ↗

Figure 8. FP/FN sweep: adoption rate, loss reduction rate, failure reduction rate, and underwriter wallet … view at source ↗

Figure 9. Sigmoid collateral sweep: adoption rate, loss reduction rate, failure reduction rate, and wallet … view at source ↗
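Figures 7–9 describe parameter sweeps over the underwriting economics. The shape of such a sweep can be illustrated with a minimal Monte Carlo sketch: a loading factor multiplies an actuarially fair premium, and user loss reduction is compared against uninsured transactions. Every number here (failure rate, loss size, the loss model itself) is an assumption, not taken from the paper's simulation.

```python
# Hypothetical loading-factor sweep: as the underwriter's loading rises, the
# premium grows and the user's net loss reduction shrinks, which is the kind of
# adoption/benefit trade-off the figures plot.
import random

def simulate(loading_factor: float, n: int = 10_000,
             fail_rate: float = 0.05, loss: float = 100.0,
             seed: int = 0) -> dict:
    """Compare expected user losses with and without an ARS-style premium."""
    rng = random.Random(seed)
    # Premium = actuarially fair price (fail_rate * loss) times a loading factor.
    premium = loading_factor * fail_rate * loss
    uninsured_loss = insured_loss = 0.0
    for _ in range(n):
        failed = rng.random() < fail_rate
        uninsured_loss += loss if failed else 0.0
        insured_loss += premium  # losses are compensated; the user only pays premium
    return {"loading_factor": loading_factor,
            "loss_reduction": 1 - insured_loss / uninsured_loss}

for lf in (1.0, 1.25, 1.5, 2.0):
    print(simulate(lf))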
read the original abstract

Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability. As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the operational meaning of trust shifts to end-to-end outcomes: whether an agent completes tasks, follows user intent, and avoids failures that cause material or psychological harm. These risks are fundamentally product-level and cannot be eliminated by technical safeguards alone because agent behavior is inherently stochastic. To address this gap between model-level reliability and user-facing assurance, we propose a complementary framework based on risk management. Drawing inspiration from financial underwriting, we introduce the Agentic Risk Standard (ARS), a payment settlement standard for AI-mediated transactions. ARS integrates risk assessment, underwriting, and compensation into a single transaction framework that protects users when interacting with agents. Under ARS, users receive predefined and contractually enforceable compensation in cases of execution failure, misalignment, or unintended outcomes. This shifts trust from an implicit expectation about model behavior to an explicit, measurable, and enforceable product guarantee. We also present a simulation study analyzing the social benefits of applying ARS to agentic transactions. ARS's implementation can be found at https://github.com/t54-labs/AgenticRiskStandard.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes the Agentic Risk Standard (ARS), a financial risk-management framework for AI agents that integrates assessment, underwriting, and compensation into transactions. Under ARS, users receive predefined, contractually enforceable payouts for execution failures, misalignment, or unintended outcomes, shifting trust from implicit model behavior to an explicit product guarantee. A simulation study is included to illustrate social benefits of applying ARS to agentic transactions.

Significance. If the ARS framework is sound and implementable, it supplies a complementary, product-level mechanism for managing stochastic risks that technical safeguards cannot fully eliminate, potentially increasing user adoption of payment-connected AI agents. The simulation study is offered as illustrative evidence of positive social impacts rather than a rigorous proof of necessity or sufficiency.

major comments (1)
  1. Simulation study section: the manuscript states that a simulation was performed to analyze social benefits, yet supplies no quantitative metrics, baseline comparisons, statistical tests, or sensitivity analysis. Because the effectiveness and benefit claims rest on this study, the absence of these details is load-bearing for evaluating the framework.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review. We agree that the simulation study requires substantial expansion to include quantitative metrics, baselines, statistical tests, and sensitivity analysis. The revised manuscript will address this directly while preserving the illustrative intent of the study.

read point-by-point responses
  1. Referee: Simulation study section: the manuscript states that a simulation was performed to analyze social benefits, yet supplies no quantitative metrics, baseline comparisons, statistical tests, or sensitivity analysis. Because the effectiveness and benefit claims rest on this study, the absence of these details is load-bearing for evaluating the framework.

    Authors: We concur that the current presentation of the simulation study is insufficiently detailed for rigorous evaluation. In the revised version, we will expand the section to report concrete quantitative metrics such as mean user payout amounts, aggregate social welfare gains, and risk reduction percentages; include explicit baseline comparisons against non-ARS agent transactions; apply appropriate statistical tests (e.g., paired t-tests or Wilcoxon rank-sum tests) with reported p-values and effect sizes; and conduct sensitivity analyses over key parameters including agent failure rates, compensation levels, and transaction volumes. These additions will be presented with tables and figures to allow independent assessment of the claimed social benefits. We maintain that the study remains illustrative of potential impacts rather than a definitive proof, but we will strengthen its methodological transparency as requested.

    revision: yes
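The baseline comparison and statistical testing the rebuttal promises could take roughly this shape. The synthetic failure rate, loss size, and premium are invented for illustration, and a stdlib bootstrap stands in for the rank tests named in the rebuttal:

```python
# Hypothetical baseline comparison: per-transaction user losses without ARS
# versus a flat premium with compensated losses under ARS, with a bootstrap
# confidence interval on the mean difference. All numbers are made up.
import random
import statistics

rng = random.Random(42)
fail_rate, loss, premium = 0.10, 100.0, 5.0
baseline = [loss if rng.random() < fail_rate else 0.0 for _ in range(2000)]
with_ars = [premium] * 2000  # user pays a flat premium; losses are compensated

observed = statistics.mean(baseline) - statistics.mean(with_ars)

# Bootstrap the baseline mean (the ARS arm is constant, so only the baseline
# needs resampling) for a crude 95% interval on the loss reduction.
diffs = []
for _ in range(1000):
    resample = [rng.choice(baseline) for _ in range(len(baseline))]
    diffs.append(statistics.mean(resample) - premium)
diffs.sort()
lo, hi = diffs[25], diffs[974]
print(f"mean loss reduction {observed:.2f} per transaction, 95% CI [{lo:.2f}, {hi:.2f}]")
```

A real revision would replace the synthetic arms with simulation traces and report effect sizes alongside the interval.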

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The manuscript advances a conceptual proposal for the Agentic Risk Standard (ARS) by drawing explicit inspiration from established financial underwriting and risk-management practices. No equations, fitted parameters, or derived quantities appear in the provided text. The central claim—that a contractual compensation layer can shift trust to an explicit product guarantee—rests on the premise that agent risks are stochastic and product-level, which is stated as an assumption rather than derived from any self-referential definition or prior author work. The simulation study is presented as illustrative evidence of social benefits, not as a proof that reduces to fitted inputs or self-citations. The derivation chain is therefore self-contained and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The proposal rests on the domain assumption that technical model safeguards are inherently insufficient for stochastic agent behavior and introduces ARS as a new contractual construct without external validation data in the abstract.

axioms (1)
  • domain assumption Agent behavior is inherently stochastic and risks cannot be eliminated by technical safeguards alone.
    Explicitly stated in the abstract as the reason a complementary risk-management layer is needed.
invented entities (1)
  • Agentic Risk Standard (ARS) no independent evidence
    purpose: A payment settlement standard that integrates risk assessment, underwriting, and compensation for AI-mediated transactions.
    Newly defined framework presented without independent empirical support or external benchmarks in the abstract.

pith-pipeline@v0.9.0 · 5538 in / 1298 out tokens · 33717 ms · 2026-05-13T17:13:31.198967+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 11 internal anchors

  1. [1]

    Impact of model interpretability and outcome feedback on trust in ai

    Daehwan Ahn, Abdullah Almaatouq, Monisha Gulabani, and Kartik Hosanagar. Impact of model interpretability and outcome feedback on trust in ai. InProceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1–25,

  2. [2]

    Rest meets react: Self-improvement for multi-step reasoning llm agent,

    Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, et al. Rest meets react: Self-improvement for multi-step reasoning llm agent.arXiv preprint arXiv:2312.10003,

  3. [3]

    Frontier AI regulation: Managing emerging risks to public safety

    Markus Anderljung, Joslyn Barnhart, Anton Korinek, Jade Leung, Cullen O’Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, et al. Frontier ai regulation: Managing emerging risks to public safety. arXiv preprint arXiv:2307.03718,

  4. [4]

    Mechanistic interpretability for AI safety – a review

    Leonard Bereska and Efstratios Gavves. Mechanistic interpretability for ai safety – a review. arXiv preprint arXiv:2404.14082,

  5. [5]

    Towards llm-guided causal explainability for black-box text classifiers.arXiv preprint arXiv:2309.13340,

    Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, and Huan Liu. Towards llm-guided causal explainability for black-box text classifiers.arXiv preprint arXiv:2309.13340,

  6. [6]

    Llms for explainable ai: A comprehensive survey, 2025

    Ahsan Bilal, David Ebert, and Beiyu Lin. Llms for explainable ai: A comprehensive survey.arXiv preprint arXiv:2504.00125,

  7. [7]

    Towards implicit bias detection and mitigation in multi-agent llm interactions

    Angana Borah and Rada Mihalcea. Towards implicit bias detection and mitigation in multi-agent llm interactions. InFindings of the Association for Computational Linguistics: EMNLP 2024, pp. 9306–9326,

  8. [8]

    Language models are few-shot learners.Advances in neural information processing systems, 33:1877–1901,

    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901,

  9. [9]

    A trajectory-based safety audit of clawdbot (openclaw)

    Tianyu Chen, Dongrui Liu, Xia Hu, Jingyi Yu, and Wenjie Wang. A trajectory-based safety audit of clawdbot (openclaw).arXiv preprint arXiv:2602.14364,

  10. [10]

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261,

  11. [11]

    Composerx: Multi-agent symbolic music composition with llms,

    Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, et al. Composerx: Multi-agent symbolic music composition with llms.arXiv preprint arXiv:2404.18081,

  12. [12]

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pp. 4171–4186,

  13. [13]

    Large language model agent in financial trading: A survey

    Han Ding, Yinheng Li, Junhao Wang, and Hang Chen. Large language model agent in financial trading: A survey.arXiv preprint arXiv:2408.06361,

  14. [14]

    The paradox of stochasticity: Limited creativity and computational decoupling in temperature-varied llm outputs of structured fictional data.arXiv preprint arXiv:2502.08515,

    Evgenii Evstafev. The paradox of stochasticity: Limited creativity and computational decoupling in temperature-varied llm outputs of structured fictional data.arXiv preprint arXiv:2502.08515,

  15. [15]

    Safethinker: Reasoning about risk to deepen safety beyond shallow alignment

    Xianya Fang, Xianying Luo, Yadong Wang, Xiang Chen, Yu Tian, Zequn Sun, Rui Liu, Jun Fang, Naiqiang Tan, Yuanning Cui, et al. Safethinker: Reasoning about risk to deepen safety beyond shallow alignment. arXiv preprint arXiv:2601.16506,

  16. [16]

    Faithful explanations of black-box NLP models using LLM-generated counterfactuals

    Yair Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, and Roi Reichart. Faithful explanations of black-box nlp models using llm-generated counterfactuals. arXiv preprint arXiv:2310.00603,

  17. [17]

    Mart: Improving llm safety with multi-round automatic red-teaming

    Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, and Yuning Mao. Mart: Improving llm safety with multi-round automatic red-teaming. InProceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pp. 1927–1937,

  18. [18]

    MCP-AgentBench: Evaluating real-world language agent performance with MCP-mediated tools

    Zikang Guo, Benfeng Xu, Chiwei Zhu, Wentao Hong, Xiaorui Wang, and Zhendong Mao. Mcp-agentbench: Evaluating real-world language agent performance with mcp-mediated tools. arXiv preprint arXiv:2509.09734,

  19. [19]

    Ai regulation in europe: from the ai act to future regulatory challenges.arXiv preprint arXiv:2310.04072,

    Philipp Hacker. Ai regulation in europe: from the ai act to future regulatory challenges.arXiv preprint arXiv:2310.04072,

  20. [20]

    Memory in the Age of AI Agents

    Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, et al. Memory in the age of ai agents.arXiv preprint arXiv:2512.13564,

  21. [21]

    Memory os of ai agent

    Jiazheng Kang, Mingming Ji, Zhe Zhao, and Ting Bai. Memory os of ai agent. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 25972–25981,

  22. [22]

    Husky: A unified, open-source language agent for multi-step reasoning

    Joongwon Kim, Bhargavi Paranjape, Tushar Khot, and Hannaneh Hajishirzi. Husky: A unified, open-source language agent for multi-step reasoning. arXiv preprint arXiv:2406.06469,

  23. [23]

    Certifying llm safety against adversarial prompting

    Aounon Kumar, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil Feizi, and Himabindu Lakkaraju. Certifying llm safety against adversarial prompting.arXiv preprint arXiv:2309.02705,

  24. [24]

    Decoding biases: Automated methods and llm judges for gender bias detection in language models.arXiv preprint arXiv:2408.03907,

    Shachi H Kumar, Saurav Sahay, Sahisnu Mazumder, Eda Okur, Ramesh Manuvinakurike, Nicole Beckage, Hsuan Su, Hung-yi Lee, and Lama Nachman. Decoding biases: Automated methods and llm judges for gender bias detection in language models.arXiv preprint arXiv:2408.03907,

  25. [25]

    Cryptotrade: A reflective llm-based agent to guide zero-shot cryptocurrency trading

    Yuan Li, Bingqiao Luo, Qian Wang, Nuo Chen, Xu Liu, and Bingsheng He. Cryptotrade: A reflective llm-based agent to guide zero-shot cryptocurrency trading. InProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 1094–1106,

  26. [26]

    Slidegen: Collaborative multimodal agents for scientific slide generation.arXiv preprint arXiv:2512.04529, 2025

    Xin Liang, Xiang Zhang, Yiwei Xu, Siqi Sun, and Chenyu You. Slidegen: Collaborative multimodal agents for scientific slide generation. arXiv preprint arXiv:2512.04529,

  27. [27]

    DeepSeek-V3 Technical Report

    Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. Deepseek-v3 technical report.arXiv preprint arXiv:2412.19437,

  28. [28]

    Prompt Injection attack against LLM-integrated Applications

    Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, et al. Prompt injection attack against llm-integrated applications.arXiv preprint arXiv:2306.05499,

  29. [29]

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692,

  30. [30]

    The landscape of emerging AI agent architectures for reasoning, planning, and tool calling: A survey

    Tula Masterman, Sandi Besen, Mason Sawtell, and Alex Chao. The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey. arXiv preprint arXiv:2404.11584,

  31. [31]

    Explainable artificial intelligence (XAI): from inherent explainability to large language models

    Fuseini Mumuni and Alhassan Mumuni. Explainable artificial intelligence (xai): from inherent explainability to large language models. arXiv preprint arXiv:2501.09967,

  32. [32]

    Mobileflow: A multimodal llm for mobile gui agent.arXiv preprint arXiv:2407.04346, 2024

    Songqin Nong, Jiali Zhu, Rui Wu, Jiongchao Jin, Shuo Shan, Xiutian Huang, and Wenhao Xu. Mobileflow: A multimodal llm for mobile gui agent.arXiv preprint arXiv:2407.04346,

  33. [33]

    When agents trade: Live multi-market trading benchmark for LLM agents

    Lingfei Qian, Xueqing Peng, Yan Wang, Vincent Jim Zhang, Huan He, Hanley Smith, Yi Han, Yueru He, Haohang Li, Yupeng Cao, et al. When agents trade: Live multi-market trading benchmark for llm agents.arXiv preprint arXiv:2510.11695,

  34. [34]

    ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

    Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. Toolllm: Facilitating large language models to master 16000+ real-world apis.arXiv preprint arXiv:2307.16789,

  35. [35]

    A Comprehensive Survey of Agents for Computer Use: Foundations, Challenges, and Future Directions

    Pascal J Sager, Benjamin Meyer, Peng Yan, Rebekka von Wartburg-Kottler, Layan Etaiwi, Aref Enayati, Gabriel Nobel, Ahmed Abdulkadir, Benjamin F Grewe, and Thilo Stadelmann. A comprehensive survey of agents for computer use: Foundations, challenges, and future directions.arXiv preprint arXiv:2501.16150,

  36. [36]

    Enhancing trust in LLM-based AI automation agents: New considerations and future challenges

    Sivan Schwartz, Avi Yaeli, and Segev Shlomov. Enhancing trust in llm-based ai automation agents: New considerations and future challenges.arXiv preprint arXiv:2308.05391,

  37. [37]

    Rethinking interpretability in the era of large language models.arXiv preprint arXiv:2402.01761, 2024

    Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, and Jianfeng Gao. Rethinking interpretability in the era of large language models.arXiv preprint arXiv:2402.01761,

  38. [38]

    Autoagent: A fully-automated and zero-code framework for llm agents.arXiv preprint arXiv:2502.05957,

    Jiabin Tang, Tianyu Fan, and Chao Huang. Autoagent: A fully-automated and zero-code framework for llm agents.arXiv preprint arXiv:2502.05957,

  39. [39]

    Adversarial preference learning for robust llm alignment

    Yuanfu Wang, Pengyu Wang, Chenyang Xi, Bo Tang, Junyi Zhu, Wenqiang Wei, Chen Chen, Chao Yang, Jingfeng Zhang, Chaochao Lu, et al. Adversarial preference learning for robust llm alignment. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 21865–21881,

  40. [40]

    TradingAgents: Multi-agents LLM financial trading framework

    Yijia Xiao, Edward Sun, Di Luo, and Wei Wang. Tradingagents: Multi-agents llm financial trading framework. arXiv preprint arXiv:2412.20138,

  41. [41]

    Qwen3 Technical Report

    An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025a.
    Xianjun Yang, Xiao Wang, Qi Zhang, Linda Petzold, William Yang Wang, Xun Zhao, and Dahua Lin. Shadow alignment: The ease of subverting safely-aligned language models. ar...

  42. [42]

    A survey of ai agent protocols.arXiv preprint arXiv:2504.16736,

    Yingxuan Yang, Huacan Chai, Yuanyi Song, Siyuan Qi, Muning Wen, Ning Li, Junwei Liao, Haoyi Hu, Jianghao Lin, Gaowei Chang, et al. A survey of ai agent protocols. arXiv preprint arXiv:2504.16736, 2025b.
    Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, and Jiang Bian. Musicagent: An ai agent for music understanding and generatio...

  43. [43]

    Large language model-brained gui agents: A survey,

    Chaoyun Zhang, Shilin He, Jiaxu Qian, Bowen Li, Liqun Li, Si Qin, Yu Kang, Minghua Ma, Guyue Liu, Qingwei Lin, et al. Large language model-brained gui agents: A survey.arXiv preprint arXiv:2411.18279,

  44. [44]

    PosterGen: Aesthetic-Aware Multi-Modal Paper-to-Poster Generation via Multi-Agent LLMs

    Chi Zhang, Zhao Yang, Jiaxuan Liu, Yanda Li, Yucheng Han, Xin Chen, Zebiao Huang, Bin Fu, and Gang Yu. Appagent: Multimodal agents as smartphone users. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pp. 1–20, 2025a.
    Zhilin Zhang, Xiang Zhang, Jiaqi Wei, Yiwei Xu, and Chenyu You. Postergen: Aesthetic-aware paper-to-poster ...

  45. [45]

    Universal and Transferable Adversarial Attacks on Aligned Language Models

    Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J Zico Kolter, and Matt Fredrikson. Universal and transferable adversarial attacks on aligned language models.arXiv preprint arXiv:2307.15043,