arXiv preprint arXiv:2302.09432
3 Pith papers cite this work.
citing papers explorer
- EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments
  EconWebArena is a new benchmark with 360 curated economic tasks across 82 authoritative websites for evaluating multimodal web agents on navigation, grounding, and data extraction.
- FinDocMRE: A Benchmark for Document-Level Financial Multimodal Reasoning Evaluation
  FinDocMRE is a new multi-image document-level benchmark spanning 12 financial domains and 5 task types, showing that 11 tested LMMs all score below 65 overall with particular weaknesses in numerical estimation and cross-page grounding.
- MetaGraph: A Large-Scale Meta-Analysis of GenAI in Financial NLP (2022-2025)