V-tableR1 uses a critic VLM for dense step-level feedback and a new PGPO algorithm to shift multimodal table reasoning from pattern matching to verifiable logical steps, achieving SOTA accuracy with a 4B open-source model.
arXiv preprint arXiv:2505.12415 (2025)
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
SpreadsheetAgent uses incremental multi-format reading, structural sketching, and verification to raise spreadsheet benchmark accuracy from 35.27% to 38.16%.
citing papers explorer
-
V-tableR1: Process-Supervised Multimodal Table Reasoning with Critic-Guided Policy Optimization
V-tableR1 uses a critic VLM for dense step-level feedback and a new PGPO algorithm to shift multimodal table reasoning from pattern matching to verifiable logical steps, achieving SOTA accuracy with a 4B open-source model.
-
Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning
SpreadsheetAgent uses incremental multi-format reading, structural sketching, and verification to raise spreadsheet benchmark accuracy from 35.27% to 38.16%.