Finben: A holistic financial benchmark for large language models

Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xio · 2024 · DOI 10.52202/079017-3033

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AuditFraudBench: Benchmarking Audit Judgment in Detecting Fraudulent Misstatements

cs.CE · 2026-06-06 · unverdicted · novelty 7.0

AuditFraudBench is a new enforcement-grounded benchmark with three tasks for testing whether LLMs can detect fraudulent misstatements by reasoning over financial figures, disclosure framing, and known manipulation patterns.

Herculean: An Agentic Benchmark for Financial Intelligence

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

Herculean benchmark shows frontier agents handle trading and market insights better than hedging and auditing workflows that demand state consistency and structured verification.

citing papers explorer

AuditFraudBench: Benchmarking Audit Judgment in Detecting Fraudulent Misstatements

cs.CE · 2026-06-06 · unverdicted · none · ref 38

AuditFraudBench is a new enforcement-grounded benchmark with three tasks for testing whether LLMs can detect fraudulent misstatements by reasoning over financial figures, disclosure framing, and known manipulation patterns.

Herculean: An Agentic Benchmark for Financial Intelligence

cs.AI · 2026-05-14 · unverdicted · none · ref 7

Herculean benchmark shows frontier agents handle trading and market insights better than hedging and auditing workflows that demand state consistency and structured verification.
