arXiv preprint arXiv:2601.08689

Zhaolu Kang, Junhao Gong, Wenqing Hu, Shuo Yin, Kehan Jiang, Zhicheng Fang, Yingjie He, Chunlei Meng, Rong Fu, Dongyang Chen, Leqi Zheng, Eric Hanchen Jiang, Yunfei Feng, Yitong Leng, Junfan Zhu, Xiaoyou Chen, Xi Yang, Richeng Xuan · 2026 · arXiv 2601.08689

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SysTradeBench: An Iterative Build-Test-Patch Benchmark for Strategy-to-Code Trading Systems with Drift-Aware Diagnostics

cs.SE · 2026-04-06 · unverdicted · novelty 6.0

SysTradeBench evaluates 17 LLMs on 12 trading strategies, finding over 91.7% code validity but rapid convergence in iterative fixes and a continued need for human oversight on critical strategies.

Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval

cs.CL · 2026-03-26 · unverdicted · novelty 5.0

HDRR combines document-level semantic routing with scoped chunk retrieval to outperform both pure chunk-based retrieval and semantic file routing on the FinDER benchmark, delivering higher average scores, lower failure rates, and more perfect answers.

BacktestBench: Benchmarking Large Language Models for Automated Quantitative Strategy Backtesting

cs.CL · 2026-05-18

citing papers explorer

Showing 3 of 3 citing papers.

SysTradeBench: An Iterative Build-Test-Patch Benchmark for Strategy-to-Code Trading Systems with Drift-Aware Diagnostics cs.SE · 2026-04-06 · unverdicted · none · ref 27
SysTradeBench evaluates 17 LLMs on 12 trading strategies, finding over 91.7% code validity but rapid convergence in iterative fixes and a continued need for human oversight on critical strategies.
Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval cs.CL · 2026-03-26 · unverdicted · none · ref 7
HDRR combines document-level semantic routing with scoped chunk retrieval to outperform both pure chunk-based retrieval and semantic file routing on the FinDER benchmark, delivering higher average scores, lower failure rates, and more perfect answers.
BacktestBench: Benchmarking Large Language Models for Automated Quantitative Strategy Backtesting cs.CL · 2026-05-18 · unreviewed · ref 11

arXiv preprint arXiv:2601.08689

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer