
arXiv preprint arXiv:2409.12183 (2024)

Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, Greg Durrett · 2024 · arXiv 2409.12183

13 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

DB-3DME: From Dataset to Benchmark for Human-aligned Automatic 3D Mesh Evaluation

cs.CV · 2026-06-08 · unverdicted · novelty 6.0

DB-3DME supplies a human-rated 3D mesh dataset and shows that fine-tuning the visual encoder of Qwen-2.5-VL-7B produces automatic evaluations that align better with humans than prior VLMs.

LoRi: Low-Rank Distillation for Implicit Reasoning

cs.CL · 2026-06-03 · unverdicted · novelty 6.0

LoRi distills implicit chain-of-thought by matching low-rank structures in hidden states, raising math-reasoning accuracy toward explicit CoT levels on LLaMA and Qwen models.

CrystalReasoner: Reasoning and RL for Property-Conditioned Crystal Structure Generation

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

CrystalReasoner combines LLM reasoning traces with physical priors and multi-objective RL to generate valid, stable, and property-conditioned crystal structures.

ReactBench: A Benchmark for Topological Reasoning in MLLMs on Chemical Reaction Diagrams

cs.AI · 2026-04-17 · unverdicted · novelty 6.0

The ReactBench benchmark shows that MLLMs suffer a performance drop of over 30% on complex topological reasoning tasks, relative to basic ones, when evaluated on chemical reaction diagrams.

HiRO-Nav: Hybrid ReasOning Enables Efficient Embodied Navigation

cs.AI · 2026-04-09 · unverdicted · novelty 6.0

HiRO-Nav adaptively triggers reasoning only on high-entropy actions via a hybrid training pipeline and shows better success-token trade-offs than always-reason or never-reason baselines on the CHORES-S benchmark.

ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs

cs.CR · 2025-04-08 · unverdicted · novelty 6.0

ShadowCoT introduces a reasoning-level backdoor attack on LLMs that achieves a 94.4% attack success rate and an 88.4% hijacking success rate while updating only 0.15% of parameters, using internal state conditioning and reasoning chain pollution.

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

Early entropy dynamics during LLM decoding mark when explicit reasoning becomes beneficial, enabling the training-free EDRM router that selects strategies per instance and yields 41-55% token savings with accuracy gains across 15 benchmarks.

Curriculum Learning-Guided Progressive Distillation in Large Language Models

cs.LG · 2026-05-11 · unverdicted · novelty 5.0

CLPD improves LLM distillation for reasoning by combining explicit data curriculum with progressive teacher scheduling of increasing capacity.

Decodable but Not Corrected by Fixed Residual-Stream Linear Steering: Evidence from Medical LLM Failure Regimes

cs.AI · 2026-05-07 · unverdicted · novelty 5.0

Overthinking in medical QA is linearly decodable at 71.6% accuracy yet fixed residual-stream steering yields no correction across 29 configurations, while enabling selective abstention with AUROC 0.610.

Enhancing Linux Privilege Escalation Attack Capabilities of Local LLM Agents

cs.CR · 2026-04-29 · unverdicted · novelty 5.0

Targeted prompting and system interventions enable local LLMs such as Llama 3.1 70B to exploit 83% of tested Linux privilege escalation vulnerabilities.

Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification?

cs.CV · 2026-01-11 · unverdicted · novelty 5.0

Longer textual reasoning chains degrade MLLM accuracy on fine-grained visual tasks; a new normalization and constrained-reward training framework mitigates the effect and sets new SOTA numbers.

Rethinking Wireless Communications through Formal Mathematical AI Reasoning

eess.SP · 2026-04-28 · unverdicted · novelty 4.0

Proposes a three-layer framework using formal AI reasoning for verification, derivation, and discovery in wireless communications theory.

ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery

cs.CL · 2026-04-10

citing papers explorer

Showing 13 of 13 citing papers.

DB-3DME: From Dataset to Benchmark for Human-aligned Automatic 3D Mesh Evaluation cs.CV · 2026-06-08 · unverdicted · none · ref 31
DB-3DME supplies a human-rated 3D mesh dataset and shows that fine-tuning the visual encoder of Qwen-2.5-VL-7B produces automatic evaluations that align better with humans than prior VLMs.
LoRi: Low-Rank Distillation for Implicit Reasoning cs.CL · 2026-06-03 · unverdicted · none · ref 29
LoRi distills implicit chain-of-thought by matching low-rank structures in hidden states, raising math-reasoning accuracy toward explicit CoT levels on LLaMA and Qwen models.
CrystalReasoner: Reasoning and RL for Property-Conditioned Crystal Structure Generation cs.AI · 2026-05-14 · unverdicted · none · ref 14
CrystalReasoner combines LLM reasoning traces with physical priors and multi-objective RL to generate valid, stable, and property-conditioned crystal structures.
ReactBench: A Benchmark for Topological Reasoning in MLLMs on Chemical Reaction Diagrams cs.AI · 2026-04-17 · unverdicted · none · ref 2
The ReactBench benchmark shows that MLLMs suffer a performance drop of over 30% on complex topological reasoning tasks, relative to basic ones, when evaluated on chemical reaction diagrams.
HiRO-Nav: Hybrid ReasOning Enables Efficient Embodied Navigation cs.AI · 2026-04-09 · unverdicted · none · ref 29
HiRO-Nav adaptively triggers reasoning only on high-entropy actions via a hybrid training pipeline and shows better success-token trade-offs than always-reason or never-reason baselines on the CHORES-S benchmark.
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs cs.CR · 2025-04-08 · unverdicted · none · ref 3
ShadowCoT introduces a reasoning-level backdoor attack on LLMs that achieves a 94.4% attack success rate and an 88.4% hijacking success rate while updating only 0.15% of parameters, using internal state conditioning and reasoning chain pollution.
When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions cs.LG · 2026-05-20 · unverdicted · none · ref 2
Early entropy dynamics during LLM decoding mark when explicit reasoning becomes beneficial, enabling the training-free EDRM router that selects strategies per instance and yields 41-55% token savings with accuracy gains across 15 benchmarks.
Curriculum Learning-Guided Progressive Distillation in Large Language Models cs.LG · 2026-05-11 · unverdicted · none · ref 38
CLPD improves LLM distillation for reasoning by combining explicit data curriculum with progressive teacher scheduling of increasing capacity.
Decodable but Not Corrected by Fixed Residual-Stream Linear Steering: Evidence from Medical LLM Failure Regimes cs.AI · 2026-05-07 · unverdicted · none · ref 60
Overthinking in medical QA is linearly decodable at 71.6% accuracy yet fixed residual-stream steering yields no correction across 29 configurations, while enabling selective abstention with AUROC 0.610.
Enhancing Linux Privilege Escalation Attack Capabilities of Local LLM Agents cs.CR · 2026-04-29 · unverdicted · none · ref 25
Targeted prompting and system interventions enable local LLMs such as Llama 3.1 70B to exploit 83% of tested Linux privilege escalation vulnerabilities.
Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification? cs.CV · 2026-01-11 · unverdicted · none · ref 31
Longer textual reasoning chains degrade MLLM accuracy on fine-grained visual tasks; a new normalization and constrained-reward training framework mitigates the effect and sets new SOTA numbers.
Rethinking Wireless Communications through Formal Mathematical AI Reasoning eess.SP · 2026-04-28 · unverdicted · none · ref 77
Proposes a three-layer framework using formal AI reasoning for verification, derivation, and discovery in wireless communications theory.
ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery cs.CL · 2026-04-10 · unreviewed · ref 4
