BiNSGPS: Geometry Problem Solving via Bidirectional Neuro-Symbolic Interaction

Cheng-Lin Liu; Fei Yin; Peijie Wang; Qi Wang

arxiv: 2606.04648 · v1 · pith:L7AH75THnew · submitted 2026-06-03 · 💻 cs.AI

Pith reviewed 2026-06-28 06:16 UTC · model grok-4.3

classification 💻 cs.AI

keywords geometry problem solving · neuro-symbolic interaction · bidirectional feedback · multimodal large language model · symbolic solver · formal representation correction

The pith

Bidirectional feedback lets a multimodal LLM adviser correct formal representations for a symbolic geometry solver.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that existing neuro-symbolic geometry solvers break when an early neural step produces an inconsistent formal representation because information flows only one way. BiNSGPS creates a loop in which the symbolic solver sends concrete feedback back to the MLLM adviser, which then either repairs the representation or adds auxiliary hypotheses. This interaction is presented as the mechanism that resolves conflicts and completes deductions that unidirectional pipelines cannot finish. A sympathetic reader cares because the approach tries to keep the adaptability of neural models while retaining the reliability of symbolic execution.

Core claim

We propose BiNSGPS, a framework that establishes Bidirectional Neuro-Symbolic Interaction (BiNS) between an MLLM Adviser and a Symbolic Solver. The MLLM Adviser actively incorporates feedback from the symbolic solver to dynamically rectify inconsistent formal representations or propose auxiliary hypotheses, resolving symbolic conflicts and facilitating complex deductions.

What carries the argument

Bidirectional Neuro-Symbolic Interaction (BiNS): the MLLM Adviser receives and acts on concrete solver feedback to repair or augment the formal representation passed to the solver.
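
The interaction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract specifies no interface, so `solve_bidirectional`, `SolverResult`, the string-based `Facts` type, the `max_rounds` cap, and the toy solver and adviser are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple

# Assumption: a formal representation is modelled as a set of fact strings.
Facts = FrozenSet[str]


@dataclass
class SolverResult:
    status: str                    # "solved", "conflict", or "stuck"
    answer: Optional[str] = None
    feedback: Facts = frozenset()  # facts the solver reports as conflicting


Solver = Callable[[Facts], SolverResult]
Adviser = Callable[[Facts, SolverResult], Facts]


def solve_bidirectional(
    facts: Facts, solver: Solver, adviser: Adviser, max_rounds: int = 3
) -> Tuple[Optional[str], int]:
    """Alternate solver and adviser until solved or the round budget is spent.

    Returns (answer, number of adviser rounds used); answer is None on failure.
    """
    for round_no in range(max_rounds + 1):
        result = solver(facts)
        if result.status == "solved":
            return result.answer, round_no
        if round_no == max_rounds:
            break
        # Feedback flows back: the adviser repairs or augments the facts.
        facts = adviser(facts, result)
    return None, max_rounds


def toy_solver(facts: Facts) -> SolverResult:
    """Hypothetical stand-in for a symbolic solver."""
    clash = frozenset({"angle(ABC)=90", "angle(ABC)=60"})
    if clash <= facts:
        return SolverResult("conflict", feedback=clash)
    if "midpoint(M,AC)" not in facts:
        return SolverResult("stuck")
    return SolverResult("solved", answer="BM=AC/2")


def toy_adviser(facts: Facts, result: SolverResult) -> Facts:
    """Hypothetical stand-in for the MLLM Adviser."""
    if result.status == "conflict":
        # Rectify: drop the fact this toy adviser treats as a parsing error.
        return facts - {"angle(ABC)=60"}
    # Otherwise propose an auxiliary hypothesis.
    return facts | {"midpoint(M,AC)"}
```

Starting from a representation that contains both angle facts, the toy loop needs one repair round and one auxiliary-hypothesis round before the solver succeeds. A real adviser could also introduce new conflicts, which is why the sketch caps the number of rounds.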

If this is right

Early-stage neural parsing errors no longer force the entire solution to fail.
Symbolic conflicts can trigger targeted neural hypothesis generation instead of halting.
Complex multi-step deductions become reachable through iterative correction rather than single-pass correctness.
The system gains robustness to variations in diagram or text input that would otherwise produce inconsistent formalizations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same feedback loop could be applied to other domains that combine neural parsing with symbolic execution, such as algebraic word problems or logical entailment.
If the adviser's corrections prove reliable, the need for exhaustive upfront diagram parsing decreases.
Repeated interaction rounds might surface previously hidden auxiliary lemmas that a one-shot pipeline would miss.

Load-bearing premise

The MLLM adviser can reliably interpret solver feedback and produce corrected representations or useful hypotheses without creating new inconsistencies.

What would settle it

A controlled set of geometry problems in which the initial neural formalization contains detectable errors; measure whether the bidirectional loop produces correct final solutions more often than a unidirectional baseline on the same inputs.
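
One way such a comparison could be scored is sketched below, assuming each problem yields a pass/fail outcome for both systems. The function name `paired_comparison` and the choice of an exact McNemar test on the discordant pairs are assumptions made here for illustration; the paper does not prescribe an evaluation protocol.

```python
from math import comb
from typing import Dict, Sequence


def paired_comparison(
    baseline_ok: Sequence[bool], loop_ok: Sequence[bool]
) -> Dict[str, float]:
    """Compare two systems on the same problems using paired outcomes."""
    if len(baseline_ok) != len(loop_ok):
        raise ValueError("both systems must be run on the same problems")
    if not baseline_ok:
        raise ValueError("need at least one problem")

    pairs = list(zip(baseline_ok, loop_ok))
    only_loop = sum(1 for b, s in pairs if s and not b)  # loop fixed a failure
    only_base = sum(1 for b, s in pairs if b and not s)  # loop broke a success
    discordant = only_loop + only_base

    # Exact two-sided McNemar test: under the null hypothesis the
    # discordant pairs split 50/50 between the two systems.
    if discordant == 0:
        p_value = 1.0
    else:
        k = min(only_loop, only_base)
        tail = sum(comb(discordant, i) for i in range(k + 1)) / 2 ** discordant
        p_value = min(1.0, 2 * tail)

    n = len(pairs)
    return {
        "baseline_accuracy": sum(baseline_ok) / n,
        "loop_accuracy": sum(loop_ok) / n,
        "only_loop": only_loop,
        "only_base": only_base,
        "p_value": p_value,
    }
```

Reporting `only_base` alongside `only_loop` matters here: it counts the problems where the loop turned a correct solution into a wrong one, which is the failure mode the load-bearing premise is concerned with.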

Figures

Figures reproduced from arXiv: 2606.04648 by Cheng-Lin Liu, Fei Yin, Peijie Wang, Qi Wang.

**Figure 2.** Framework of the BiNSGPS pipeline. Top: Multimodal Representations Alignment. Geometry diagram-text pairs are first annotated. Each pair is then processed by a dual-parser architecture in which a specialist neural network extracts structural diagram primitives while an MLLM parses textual constraints. These are integrated into initial logic forms L, which are aligned and completed to form a comprehensive sy…

**Figure 3.** Performance in MathVista (a) and Adviser…

**Figure 4.** Failure cases of current methods. Qwen3-VL-Plus exhibits hallucination-induced errors during…

**Figure 5.** Success case of BiNSGPS’s Rectify Inconsistent Representations.

**Figure 6.** Success case of BiNSGPS’s Propose Auxiliary Hypotheses.

**Figure 7.** Failure cases of BiNSGPS, including a case with correct results but wrong steps (top) and a failed case (bottom).

Original abstract

Geometry problem solving poses distinct challenges in artificial intelligence. Existing approaches typically fall into two paradigms: symbolic methods, which exhibit limited adaptability, and neural methods, which are prone to hallucinations. Recent neuro-symbolic hybrids predominantly rely on a unidirectional pipeline where neural outputs are fed into solvers without feedback, making the system brittle to early-stage errors. To break this unidirectional bottleneck, we propose BiNSGPS, a framework that establishes Bidirectional Neuro-Symbolic Interaction (BiNS) between an MLLM Adviser and a Symbolic Solver. The MLLM Adviser actively incorporates feedback from the symbolic solver to dynamically rectify inconsistent formal representations or propose auxiliary hypotheses, resolving symbolic conflicts and facilitating complex deductions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BiNSGPS is a high-level proposal for bidirectional MLLM-symbolic feedback in geometry solving, but the abstract supplies no implementation details or results to back the central claim.

The main takeaway is that this paper proposes BiNSGPS, a framework with bidirectional interaction between an MLLM adviser and a symbolic solver for geometry problems. The MLLM is meant to take solver feedback on inconsistent representations or conflicts and either fix them or generate auxiliary hypotheses.

It does a reasonable job of naming the brittleness in existing unidirectional neuro-symbolic pipelines, where early neural mistakes break the whole system. The bidirectional loop is the element presented as new.

The soft spot is substantial and sits right at the core. The abstract gives no architecture, no prompt templates, no fine-tuning approach, and no error analysis for how the MLLM would actually use symbolic feedback without injecting new inconsistencies. Geometry formal languages are precise, and current MLLMs are known to struggle with exact symbolic edits even with guidance. Without any experiments, ablations, or even a worked example, there is no evidence the loop delivers the claimed benefit.

The work is a framework sketch rather than a completed result. It would interest people already working on neuro-symbolic hybrids for math reasoning or education tools, but readers wanting reproducible methods or performance numbers will find little to use.

I would not send this to peer review in its current state. It needs at least a concrete implementation and some initial results before it is ready for serious refereeing.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes BiNSGPS, a framework for geometry problem solving that introduces Bidirectional Neuro-Symbolic Interaction (BiNS) between an MLLM Adviser and a Symbolic Solver. The MLLM Adviser incorporates feedback from the solver to dynamically rectify inconsistent formal representations or propose auxiliary hypotheses, aiming to overcome the brittleness of unidirectional neuro-symbolic pipelines.

Significance. If the bidirectional interaction can be made reliable, the framework could advance neuro-symbolic methods for mathematical reasoning by enabling correction of early errors and supporting complex deductions in geometry, where unidirectional approaches often fail due to unrecoverable mistakes.

major comments (3)

[Abstract] The central claim that the MLLM Adviser can reliably interpret solver feedback to produce corrected formal representations or useful auxiliary hypotheses without introducing new inconsistencies is presented at a high level only, with no architecture, prompt templates, fine-tuning procedure, or error analysis supplied to substantiate the capability.
[Abstract] No experimental results, ablation studies, benchmarks on geometry datasets, or case studies are provided to demonstrate that the bidirectional loop outperforms unidirectional pipelines or resolves symbolic conflicts effectively; the claimed benefit therefore rests on performance that has not been shown.
[Abstract] The description of how the BiNS interaction resolves conflicts or facilitates deductions lacks any formal specification, pseudocode, or interface definition between the MLLM and symbolic components, making it impossible to assess feasibility or potential for circularity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for greater detail and substantiation in our presentation of BiNSGPS. We agree that the current manuscript is primarily conceptual and will expand the relevant sections with the requested elements in the revised version.

Referee: [Abstract] The central claim that the MLLM Adviser can reliably interpret solver feedback to produce corrected formal representations or useful auxiliary hypotheses without introducing new inconsistencies is presented at a high level only, with no architecture, prompt templates, fine-tuning procedure, or error analysis supplied to substantiate the capability.

Authors: We acknowledge that the abstract and initial description remain high-level. The full manuscript elaborates the MLLM Adviser architecture in Section 3, including the bidirectional feedback loop. To directly address the concern, we will incorporate example prompt templates, a description of any fine-tuning, and an error analysis of feedback interpretation in the revised manuscript. revision: yes

Referee: [Abstract] No experimental results, ablation studies, benchmarks on geometry datasets, or case studies are provided to demonstrate that the bidirectional loop outperforms unidirectional pipelines or resolves symbolic conflicts effectively; the claimed benefit therefore rests on an unshown performance.

Authors: The submitted manuscript presents the BiNSGPS framework and its motivation but does not yet include empirical evaluation. We will add experimental results, ablation studies, benchmarks on standard geometry datasets, and case studies in the revised version to demonstrate the advantages of bidirectional interaction over unidirectional baselines. revision: yes

Referee: [Abstract] The description of how the BiNS interaction resolves conflicts or facilitates deductions lacks any formal specification, pseudocode, or interface definition between the MLLM and symbolic components, making it impossible to assess feasibility or potential for circularity.

Authors: We agree that a formal specification is necessary for assessing the interaction. The manuscript outlines the high-level BiNS mechanism but lacks pseudocode and explicit interface definitions. In revision we will add pseudocode for the bidirectional loop, a precise interface specification, and a brief discussion of safeguards against circularity. revision: yes

Circularity Check

0 steps flagged

No derivation chain present; framework proposal only

The manuscript describes an architectural framework (BiNS) for bidirectional interaction between an MLLM Adviser and a Symbolic Solver in geometry problem solving. No equations, parameters, predictions, or first-principles derivations appear in the provided text. The central claim is a system design whose correctness rests on empirical performance rather than any closed-form result that could reduce to its own inputs by construction. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results are identifiable. This matches the default expectation for non-circular papers; the work is self-contained as a proposal without mathematical self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no information on free parameters, background axioms, or new postulated entities.

pith-pipeline@v0.9.1-grok · 5642 in / 1166 out tokens · 22281 ms · 2026-06-28T06:16:15.866869+00:00 · methodology

Reference graph

Works this paper leans on

71 extracted references · 14 canonical work pages



2015

[26] [26]

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator , year=

Wu, Wenjun and Zhang, Lingling and Liu, Jun and Tang, Xi and Wang, Yaxian and Wang, Shaowei and Wang, Qianying , booktitle=. E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator , year=

[27] [27]

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving , url =

Gou, Zhibin and Shao, Zhihong and Gong, Yeyun and shen, yelong and Yang, Yujiu and Huang, Minlie and Duan, Nan and Chen, Weizhu , booktitle =. ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving , url =

[28] [28]

Liu, Mingyue and Ueda, Ryo and Wan, Zhen and Inoue, Katsumi and Willcocks, Chris G. , year=. Neuro-Symbolic Contrastive Learning for Cross-domain Inference , volume=. doi:10.4204/eptcs.416.6 , journal=

work page doi:10.4204/eptcs.416.6

[29] [29]

Neuro-symbolic Training for Reasoning over Spatial Language , url=

Premsri, Tanawan and Kordjamshidi, Parisa , year=. Neuro-symbolic Training for Reasoning over Spatial Language , url=. doi:10.18653/v1/2025.findings-naacl.128 , booktitle=

work page doi:10.18653/v1/2025.findings-naacl.128 2025

[30] [30]

2024 , eprint=

Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models , author=. 2024 , eprint=

2024

[31] [31]

2024 , eprint=

Proposing and solving olympiad geometry with guided tree search , author=. 2024 , eprint=

2024

[32] [32]

2024 , eprint=

GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation , author=. 2024 , eprint=

2024

[33] [33]

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence,

A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram , author =. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence,. 2023 , month =. doi:10.24963/ijcai.2023/376 , url =

work page doi:10.24963/ijcai.2023/376 2023

[34] [34]

Autoformalization with Large Language Models , url =

Wu, Yuhuai and Jiang, Albert Qiaochu and Li, Wenda and Rabe, Markus and Staats, Charles and Jamnik, Mateja and Szegedy, Christian , booktitle =. Autoformalization with Large Language Models , url =

[35] [35]

ViperGPT: Visual Inference via Python Execution for Reasoning , booktitle =

Sur. ViperGPT: Visual Inference via Python Execution for Reasoning , booktitle =. 2023 , pages =

2023

[36] [36]

Terufumi Morishita, Gaku Morio, Atsuki Yamaguchi, and Yasuhiro Sogawa

Pan, Liangming and Albalak, Alon and Wang, Xinyi and Wang, William. Logic- LM : Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning. Findings of the Association for Computational Linguistics: EMNLP 2023. 2023. doi:10.18653/v1/2023.findings-emnlp.248

work page doi:10.18653/v1/2023.findings-emnlp.248 2023

[37] [37]

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence,

Neuro-Symbolic Artificial Intelligence: Towards Improving the Reasoning Abilities of Large Language Models , author =. Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence,. 2025 , month =. doi:10.24963/ijcai.2025/1195 , url =

work page doi:10.24963/ijcai.2025/1195 2025

[38] [38]

2024 , eprint=

Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram , author=. 2024 , eprint=

2024

[39] [39]

Findings of the Association for Computational Linguistics: ACL 2024 , pages=

Geoeval: benchmark for evaluating llms and multi-modal models on geometry problem-solving , author=. Findings of the Association for Computational Linguistics: ACL 2024 , pages=

2024

[40] [40]

2024 , eprint=

FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving , author=. 2024 , eprint=

2024

[41] [41]

Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver , year=

Zhang, Zeren and Cheng, Jo-Ku and Deng, Jingyang and Tian, Lu and Ma, Jinwen and Qin, Ziran and Zhang, Xiaokai and Zhu, Na and Leng, Tuo , booktitle=. Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver , year=

[42] [42]

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence,

FGeo-HyperGNet: Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network , author =. Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence,. 2025 , month =. doi:10.24963/ijcai.2025/527 , url =

work page doi:10.24963/ijcai.2025/527 2025

[43] [43]

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training , author=

[44] [44]

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts , url =

Lu, Pan and Bansal, Hritik and Xia, Tony and Liu, Jiacheng and Li, Chunyuan and Hajishirzi, Hannaneh and Cheng, Hao and Chang, Kai-Wei and Galley, Michel and Gao, Jianfeng , booktitle =. MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts , url =

[45] [45]

2024 , eprint=

Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset , author=. 2024 , eprint=

2024

[46] [46]

2025 , eprint=

GeoLaux: A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines , author=. 2025 , eprint=

2025

[47] [47]

2025 , eprint=

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers , author=. 2025 , eprint=

2025

[48] [48]

2025 , eprint=

Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens , author=. 2025 , eprint=

2025

[49] [49]

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models , url =

Hu, Yushi and Shi, Weijia and Fu, Xingyu and Roth, Dan and Ostendorf, Mari and Zettlemoyer, Luke and Smith, Noah A and Krishna, Ranjay , booktitle =. Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models , url =. doi:10.52202/079017-4423 , editor =

work page doi:10.52202/079017-4423

[50] [50]

2025 , eprint=

CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images , author=. 2025 , eprint=

2025

[51] [51]

2025 , eprint=

GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation , author=. 2025 , eprint=

2025

[52] [52]

2026 , eprint=

GeoVLMath: Enhancing Geometry Reasoning in Vision-Language Models via Cross-Modal Reward for Auxiliary Line Creation , author=. 2026 , eprint=

2026

[53] [53]

2026 , eprint=

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models , author=. 2026 , eprint=

2026

[54] [54]

G eo DRL : A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning

Peng, Shuai and Fu, Di and Liang, Yijun and Gao, Liangcai and Tang, Zhi. G eo DRL : A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning. Findings of the Association for Computational Linguistics: ACL 2023. 2023. doi:10.18653/v1/2023.findings-acl.850

work page doi:10.18653/v1/2023.findings-acl.850 2023

[55] [55]

2022 , eprint=

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression , author=. 2022 , eprint=

2022

[56] [56]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Gns: Solving plane geometry problems by neural-symbolic reasoning with multi-modal llms , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

[57] [57]

Automatic understanding and formalization of natural language geometry problems using syntax-semantics models , volume =

Gan, Wenbin and Yu, Xinguo , year =. Automatic understanding and formalization of natural language geometry problems using syntax-semantics models , volume =. International Journal of Innovative Computing, Information and Control , doi =

[58] [58]

2024 , eprint=

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5\ author=. 2024 , eprint=

2024

[59] [59]

2024 , eprint=

AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding , author=. 2024 , eprint=

2024

[60] [60]

Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency , url =

Li, Zenan and Wu, Yifan and Li, Zhaoyu and Wei, Xinming and Yang, Fan and Zhang, Xian and Ma, Xiaoxing , booktitle =. Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency , url =. doi:10.52202/079017-1697 , editor =

work page doi:10.52202/079017-1697

[61] [61]

2024 , eprint=

GPT-4o System Card , author=. 2024 , eprint=

2024

[62] [62]

2025 , eprint=

Qwen2.5 Technical Report , author=. 2025 , eprint=

2025

[63] [63]

2024 , eprint=

Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models , author=. 2024 , eprint=

2024

[64] [64]

2024 , url =

Xu, Fangzhi and Wu, Zhiyong and Sun, Qiushi and Ren, Siyu and Yuan, Fei and Yuan, Shuai and Lin, Qika and Qiao, Yu and Liu, Jun. Symbol- LLM : Towards Foundational Symbol-centric Interface For Large Language Models. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.a...

work page doi:10.18653/v1/2024.acl-long.707 2024

[65] [65]

2024 , eprint=

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models , author=. 2024 , eprint=

2024

[66] [66]

2025 , eprint=

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning , author=. 2025 , eprint=

2025

[67] [67]

2024 , eprint=

GPT-4 Technical Report , author=. 2024 , eprint=

2024

[68] [68]

2025 , eprint=

Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information , author=. 2025 , eprint=

2025

[69] [69]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Recoverable compression: A multimodal vision token recovery mechanism guided by text information , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

[70] [70]

An Open-Ended Benchmark and Formal Framework for Adjuvant Research with

yi chen and Yu Zhang and Jian Xu and Hua Yue and Xinming Wang and Zequan Lyu and Xu-Yao Zhang and Wei Wei and Cheng-Lin Liu , booktitle=. An Open-Ended Benchmark and Formal Framework for Adjuvant Research with. 2026 , url=

2026

[71] [71]

arXiv preprint arXiv:2604.11600 , year=

Geoparsing: Diagram Parsing for Plane and Solid Geometry with a Unified Formal Language , author=. arXiv preprint arXiv:2604.11600 , year=

Pith/arXiv arXiv