Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration

Feng Ma; Jianqing Gao; Jianshu Zhang; Jiefeng Ma; Jun Du; Pengfei Hu; Quan Liu; Yicheng Pan; Zhenrong Zhang

arxiv: 2504.12773 · v1 · pith:IYL44EXBnew · submitted 2025-04-17 · 💻 cs.CL · cs.AI

keywords reasoning · GeoGen · MLLMs · symbolic · language · multimodal · ability · approach

Recent advances in Multimodal Large Language Models (MLLMs) have achieved remarkable progress in general domains and demonstrated promise in multimodal mathematical reasoning. However, applying MLLMs to geometry problem solving (GPS) remains challenging due to a lack of accurate step-by-step solution data and severe hallucinations during reasoning. In this paper, we propose GeoGen, a pipeline that automatically generates step-wise reasoning paths for geometry diagrams. By leveraging precise symbolic reasoning, GeoGen produces large-scale, high-quality question-answer pairs. To further enhance the logical reasoning ability of MLLMs, we train GeoLogic, a Large Language Model (LLM), using synthetic data generated by GeoGen. Serving as a bridge between natural language and symbolic systems, GeoLogic enables symbolic tools to help verify MLLM outputs, making the reasoning process more rigorous and alleviating hallucinations. Experimental results show that our approach consistently improves the performance of MLLMs, achieving remarkable results on benchmarks for geometric reasoning tasks. This improvement stems from our integration of the strengths of LLMs and symbolic systems, which enables a more reliable and interpretable approach for the GPS task. Code is available at https://github.com/ycpNotFound/GeoGen.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Draw2Think: Harnessing Geometry Reasoning through Constraint Engine Interaction
cs.CV 2026-05 unverdicted novelty 7.0

Draw2Think recasts geometric reasoning as agentic interaction with a constraint engine, achieving 95.9% predicate-level construction fidelity and up to 16.4% accuracy gains on solid geometry tasks.
GeoLaux: A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines
cs.AI 2025-08 accept novelty 7.0

GeoLaux is a new benchmark of 2186 long-step geometry problems requiring auxiliary lines, used to evaluate 23 MLLMs and reveal major drops in performance on complex tasks.
Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations
cs.CV 2026-03 unverdicted novelty 6.0

GR3D turns 3D scene geometry into ID-indexed text references, enabling zero-shot MLLM spatial reasoning gains of 9% on VSI-Bench and 12% on MindCube.
Concise Geometric Description as a Bridge: Unleashing the Potential of LLM for Plane Geometry Problem Solving
cs.AI 2026-01 unverdicted novelty 5.0

An MLLM interpreter generates concise CDL descriptions from diagrams, enabling an off-the-shelf LLM to solve plane geometry problems competitively after training on only 5.5k examples.