arxiv: 2605.22664 · 2 revisions
MBABench: Evaluating LLM Agents on End-to-End Spreadsheet Tasks in Finance