FinCriticalED benchmark reveals that OCR and MLLM systems frequently fail to preserve critical financial facts such as numbers and monetary units even when lexical accuracy is high.
Slow perception: Let’s perceive geometric figures step-by-step
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
OMIBench benchmark reveals that current LVLMs achieve at most 50% on Olympiad problems requiring reasoning across multiple images.
DeepSeek-OCR compresses text contexts up to 20x via 2D optical mapping while achieving 97% OCR accuracy below 10x and 60% at 20x, outperforming prior OCR tools with fewer vision tokens.
An MLLM interpreter generates concise CDL descriptions from diagrams, enabling an off-the-shelf LLM to solve plane geometry problems competitively after training on only 5.5k examples.
The survey organizes the shift of LLMs toward deliberate System 2 reasoning, covering model construction techniques, performance on math and coding benchmarks, and future research directions.
citing papers explorer
-
FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR
FinCriticalED benchmark reveals that OCR and MLLM systems frequently fail to preserve critical financial facts such as numbers and monetary units even when lexical accuracy is high.
-
OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model
OMIBench benchmark reveals that current LVLMs achieve at most 50% on Olympiad problems requiring reasoning across multiple images.
-
DeepSeek-OCR: Contexts Optical Compression
DeepSeek-OCR compresses text contexts up to 20x via 2D optical mapping while achieving 97% OCR accuracy below 10x and 60% at 20x, outperforming prior OCR tools with fewer vision tokens.
-
Concise Geometric Description as a Bridge: Unleashing the Potential of LLM for Plane Geometry Problem Solving
An MLLM interpreter generates concise CDL descriptions from diagrams, enabling an off-the-shelf LLM to solve plane geometry problems competitively after training on only 5.5k examples.
-
From System 1 to System 2: A Survey of Reasoning Large Language Models
The survey organizes the shift of LLMs toward deliberate System 2 reasoning, covering model construction techniques, performance on math and coding benchmarks, and future research directions.