FOLIO: Natural Language Reasoning with First-Order Logic
6 Pith papers cite this work.
citing papers explorer
- Towards Effective In-context Cross-domain Knowledge Transfer via Domain-invariant-neurons-based Retrieval
  DIN-Retrieval uses domain-invariant neuron representations to retrieve cross-domain demonstrations, achieving an average 1.8-point gain over state-of-the-art methods on mathematical and logical reasoning tasks.
- DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models
  DeltaLogic reveals that models with strong initial logical accuracy often fail to revise beliefs correctly after minimal premise edits, showing inertia even when the gold answer should change.
- Forge: Quality-Aware Reinforcement Learning for NP-Hard Optimization in LLMs
  OPT-BENCH trains LLMs on NP-hard optimization via quality-aware RLVR, achieving 93.1% success rate and 46.6% quality ratio on Qwen2.5-7B while outperforming GPT-4o and transferring gains to other domains.
- Reasoning with Language Model is Planning with World Model
  RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
- Reason Analogically via Cross-domain Prior Knowledge: An Empirical Study of Cross-domain Knowledge Transfer for In-Context Learning
  Cross-domain demonstrations enable conditional positive transfer in in-context learning beyond an example absorption threshold by repairing reasoning structures instead of relying on semantic overlap.
- Semantic-Aware Logical Reasoning via a Semiotic Framework
  LogicAgent uses a semiotic-square-guided approach to enhance logical reasoning in LLMs on the new RepublicQA benchmark and others, reporting average gains of 6.25% and 7.05% respectively.