Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning

2026 · cs.AI · arXiv 2601.02902

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), providing reliable and verifiable decision-making in high-stakes domains such as mathematical reasoning and legal judgment. In this study, we present a systematic analysis of logical reasoning under controlled increases in logical complexity, and reveal a previously unrecognized phenomenon, which we term Logical Phase Transitions: rather than degrading smoothly, logical reasoning performance remains stable within a regime but collapses abruptly beyond a critical logical depth, mirroring physical phase transitions such as water freezing beyond a critical temperature threshold. Building on this insight, we propose Neuro-Symbolic Curriculum Tuning, a principled framework that adaptively aligns natural language with logical symbols to establish a shared representation, and reshapes training dynamics around phase-transition boundaries to progressively strengthen reasoning at increasing logical depths. Experiments on five benchmarks show that our approach effectively mitigates logical reasoning collapse at high complexity, yielding average accuracy gains of +1.26 in naive prompting and +3.95 in CoT, while improving generalization to unseen logical compositions. Code and data are available at https://github.com/AI4SS/Logical-Phase-Transitions.
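
One way to make the "abrupt collapse beyond a critical logical depth" idea concrete is to look for the steepest fall in an accuracy-versus-depth curve. The sketch below is only an illustration of that reading, not the paper's method: the function name `largest_drop`, the "largest adjacent drop" criterion, and the accuracy values are all assumptions made up for this example and are not results from the paper.

```python
from typing import Sequence, Tuple


def largest_drop(depths: Sequence[int], accuracy: Sequence[float]) -> Tuple[int, float]:
    """Return (depth, drop) for the steepest fall in accuracy between adjacent depths.

    The returned depth is the last depth before the steepest fall, which serves
    here as a crude stand-in for a critical logical depth (hypothetical criterion).
    """
    if len(depths) != len(accuracy) or len(depths) < 2:
        raise ValueError("need at least two (depth, accuracy) pairs of equal length")
    best_depth, best_drop = depths[0], accuracy[0] - accuracy[1]
    for i in range(1, len(depths) - 1):
        drop = accuracy[i] - accuracy[i + 1]
        if drop > best_drop:
            best_depth, best_drop = depths[i], drop
    return best_depth, best_drop


# Synthetic numbers invented purely for illustration (NOT results from the paper):
# accuracy stays roughly flat up to depth 4, then falls sharply.
depths = [1, 2, 3, 4, 5, 6]
accuracy = [0.92, 0.91, 0.90, 0.88, 0.55, 0.50]

critical_depth, drop = largest_drop(depths, accuracy)
print(critical_depth, round(drop, 2))  # prints: 4 0.33
```

A smoothly degrading curve would spread the loss across many small steps, whereas a curve like the synthetic one above concentrates most of it in a single step. That concentration is the pattern the abstract contrasts with smooth degradation.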

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics

cs.SI · 2026-04-08 · unverdicted · novelty 7.0

IntervenSim is an intervention-aware social network simulation that couples source interventions with crowd interactions in a feedback loop, improving MAPE by 41.6% and DTW by 66.9% over prior static frameworks on real-world events.

OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

OmniTrend predicts popularity by combining separate content attractiveness and contextual exposure predictors using cross-modal and exogenous signals.

HotComment: A Benchmark for Evaluating Popularity of Online Comments

cs.AI · 2026-04-28 · unverdicted · novelty 6.0

HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylistic model.

Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction

cs.MM · 2026-04-22 · unverdicted · novelty 5.0

A new joint spatio-temporal enlargement model for micro-video popularity prediction using frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.

ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing

cs.SD · 2026-04-13 · unverdicted · novelty 5.0

ActorMind is a four-agent chain-of-thought framework that emulates human actors to produce spontaneous, emotion-infused speech responses for role-playing scenarios.

CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

cs.CV · 2026-04-29 · unverdicted · novelty 4.0

CurEvo integrates curriculum guidance into self-evolution to structure autonomous improvement of video understanding models, yielding gains on VideoQA benchmarks.
