arXiv preprint arXiv:2105.07624

Fengbin Zhu, Wenqiang Lei, et al · 2021 · arXiv 2105.07624

16 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

FIND: Toward Multimodal Financial Reasoning and Question Answering for Indic Languages

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

FinVQA is a new multilingual benchmark for Indic financial VQA with three difficulty levels and four formats, paired with the FIND framework for faithful numerical reasoning via fine-tuning and constrained decoding.

Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain

cs.CL · 2026-05-09 · unverdicted · novelty 7.0

LLMs copy biased analyst ratings in investment decisions, but a new detection method encourages independent reasoning and can improve stock return predictions beyond human levels.

INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents

cs.CV · 2026-04-13 · conditional · novelty 7.0

INDOTABVQA is a new benchmark dataset for cross-lingual table visual question answering on Bahasa Indonesia documents that exposes VLM weaknesses on complex tables and low-resource languages while showing gains from fine-tuning and table region coordinates.

FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs

cs.CL · 2025-10-10 · unverdicted · novelty 7.0

FinAuditing is a taxonomy-structured multi-document benchmark with 1,102 instances averaging over 33k tokens from XBRL filings, defining three tasks to evaluate LLMs on financial auditing capabilities.

Rethinking Information Synthesis in Multimodal Question Answering: A Multi-Agent Perspective

cs.CL · 2025-05-27 · unverdicted · novelty 7.0

MAMMQA is a multi-agent framework that decomposes multimodal queries, retrieves modality-specific answers, performs cross-modal synthesis with VLMs, and integrates results via an LLM to outperform single-model baselines on QA benchmarks.

FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information

cs.CL · 2025-05-27 · unverdicted · novelty 7.0

FinTagging decomposes XBRL tagging into FinNI extraction and FinCL full-taxonomy linking, showing LLMs handle extraction but struggle with fine-grained concept alignment in zero-shot settings.

FLARE: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding

cs.CV · 2025-04-14 · unverdicted · novelty 7.0

FLARE is a vision-language model family using text-guided vision encoding, context-aware alignment decoding, dual-semantic mapping loss, and text-driven VQA synthesis to achieve deep cross-modal integration, outperforming larger models with only 630 vision tokens at 3B scale.

Design and Report Benchmarks for Knowledge Work

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

Proposes a three-step benchmark design method (define work activity, specify tested setting, score work product) derived from work studies and O*NET, demonstrated via three case analyses.

FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models

cs.CL · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

FINESSE-Bench is a new hierarchical benchmark suite combining certification-style exams, trading tasks, and a Russian olympiad set to evaluate LLMs on financial competencies at multiple difficulty levels.

The Power of Order: Fooling LLMs with Adversarial Table Permutations

cs.LG · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

Semantically invariant row and column permutations in tables can cause LLMs to output incorrect answers, and a gradient-based attack called ATP efficiently finds such permutations that degrade performance across many models.

Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning

cs.LG · 2026-04-23 · unverdicted · novelty 5.0

TaNOS decouples table semantics from numerical structure via anonymization, sketches, and program-first self-supervision, yielding 80.13% FinQA accuracy with 10% data and near-zero cross-domain gap versus over 10pp for standard fine-tuning.

Empirical Evaluation of PDF Parsing and Chunking for Financial Question Answering with RAG

cs.CL · 2026-04-13 · unverdicted · novelty 5.0

Systematic tests show that specific PDF parsers combined with overlapping chunking strategies better preserve structure and improve RAG answer correctness on financial QA benchmarks including the new TableQuest dataset.

Attention Grounded Enhancement for Visual Document Retrieval

cs.IR · 2025-11-17 · unverdicted · novelty 5.0

AGREE boosts visual document retrieval by adding local relevance signals from MLLM attention maps to global document labels during retriever training.

RELOOP: Recursive Retrieval with Multi-Hop Reasoner and Planners for Heterogeneous QA

cs.CL · 2025-10-23 · unverdicted · novelty 5.0

RELOOP unifies retrieval across text, tables, and KGs via hierarchical sequences and dual-agent guided iteration, reporting EM/F1 gains over baselines on HotpotQA, HybridQA/TAT-QA, and MetaQA.

Bridging Language Models and Financial Analysis

q-fin.ST · 2025-03-14 · unverdicted · novelty 2.0

A survey synthesizing recent LLM research and assessing its applicability to financial data analysis.

VT-Bench: A Unified Benchmark for Visual-Tabular Multi-Modal Learning

cs.CV · 2026-05-03

citing papers explorer

Showing 16 of 16 citing papers.

FIND: Toward Multimodal Financial Reasoning and Question Answering for Indic Languages
cs.CL · 2026-05-13 · unverdicted · none · ref 6

Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain
cs.CL · 2026-05-09 · unverdicted · none · ref 54

INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents
cs.CV · 2026-04-13 · conditional · none · ref 2

FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs
cs.CL · 2025-10-10 · unverdicted · none · ref 32

Rethinking Information Synthesis in Multimodal Question Answering: A Multi-Agent Perspective
cs.CL · 2025-05-27 · unverdicted · none · ref 15

FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information
cs.CL · 2025-05-27 · unverdicted · none · ref 36

FLARE: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding
cs.CV · 2025-04-14 · unverdicted · none · ref 84

Design and Report Benchmarks for Knowledge Work
cs.AI · 2026-05-22 · unverdicted · none · ref 77

FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models
cs.CL · 2026-05-14 · unverdicted · none · ref 3 · 2 links

The Power of Order: Fooling LLMs with Adversarial Table Permutations
cs.LG · 2026-05-01 · unverdicted · none · ref 61 · 2 links

Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning
cs.LG · 2026-04-23 · unverdicted · none · ref 2

Empirical Evaluation of PDF Parsing and Chunking for Financial Question Answering with RAG
cs.CL · 2026-04-13 · unverdicted · none · ref 52

Attention Grounded Enhancement for Visual Document Retrieval
cs.IR · 2025-11-17 · unverdicted · none · ref 65

RELOOP: Recursive Retrieval with Multi-Hop Reasoner and Planners for Heterogeneous QA
cs.CL · 2025-10-23 · unverdicted · none · ref 34

Bridging Language Models and Financial Analysis
q-fin.ST · 2025-03-14 · unverdicted · none · ref 124

VT-Bench: A Unified Benchmark for Visual-Tabular Multi-Modal Learning
cs.CV · 2026-05-03 · unreviewed · ref 47