Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning

Xinglang Zhang , Yunyao Zhang , Zeliang Chen , Junqing Yu , Wei Yang , Zikai Song

Authors on Pith no claims yet

classification 💻 cs.AI cs.CLcs.LO

keywords logicalreasoningcriticalphasetransitionsbeyondcollapsecomplexity

read the original abstract

Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), providing reliable and verifiable decision-making in high-stakes domains such as mathematical reasoning and legal judgment. In this study, we present a systematic analysis of logical reasoning under controlled increases in logical complexity, and reveal a previously unrecognized phenomenon, which we term Logical Phase Transitions: rather than degrading smoothly, logical reasoning performance remains stable within a regime but collapses abruptly beyond a critical logical depth, mirroring physical phase transitions such as water freezing beyond a critical temperature threshold. Building on this insight, we propose Neuro-Symbolic Curriculum Tuning, a principled framework that adaptively aligns natural language with logical symbols to establish a shared representation, and reshapes training dynamics around phase-transition boundaries to progressively strengthen reasoning at increasing logical depths. Experiments on five benchmarks show that our approach effectively mitigates logical reasoning collapse at high complexity, yielding average accuracy gains of +1.26 in naive prompting and +3.95 in CoT, while improving generalization to unseen logical compositions. Code and data are available at https://github.com/AI4SS/Logical-Phase-Transitions.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics
cs.SI 2026-04 unverdicted novelty 7.0

IntervenSim is an intervention-aware social network simulation that couples source interventions with crowd interactions in a feedback loop, improving MAPE by 41.6% and DTW by 66.9% over prior static frameworks on rea...
OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction
cs.CV 2026-04 unverdicted novelty 6.0

OmniTrend predicts popularity by combining separate content attractiveness and contextual exposure predictors using cross-modal and exogenous signals.
HotComment: A Benchmark for Evaluating Popularity of Online Comments
cs.AI 2026-04 unverdicted novelty 6.0

HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylist...
Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction
cs.MM 2026-04 unverdicted novelty 5.0

A new joint spatio-temporal enlargement model for micro-video popularity prediction using frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.
ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing
cs.SD 2026-04 unverdicted novelty 5.0

ActorMind is a four-agent chain-of-thought framework that emulates human actors to produce spontaneous, emotion-infused speech responses for role-playing scenarios.
CurEvo: Curriculum-Guided Self-Evolution for Video Understanding
cs.CV 2026-04 unverdicted novelty 4.0

CurEvo integrates curriculum guidance into self-evolution to structure autonomous improvement of video understanding models, yielding gains on VideoQA benchmarks.