Title resolution pending

OpenAI o1 System Card , author= · 2024

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

browse 7 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Beyond Meta-Reasoning: Metacognitive Consolidation for Self-Improving LLM Reasoning

cs.AI · 2026-04-19 · unverdicted · novelty 7.0

Metacognitive Consolidation lets LLMs accumulate reusable meta-reasoning skills from past episodes to improve future performance across benchmarks.

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

cs.SE · 2025-02-25 · unverdicted · novelty 7.0

SWE-RL uses RL on software evolution data to train LLMs achieving 41% on SWE-bench Verified with generalization to other reasoning tasks.

Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

Macro uses Direct Preference Optimization on composite-scored preference pairs to improve validity of multilingual self-generated counterfactual explanations by 12.55% on average without degrading minimality.

Rotation-Preserving Supervised Fine-Tuning

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

RPSFT improves the in-domain versus out-of-domain performance trade-off during LLM supervised fine-tuning by penalizing rotations in pretrained singular subspaces as a proxy for loss-sensitive directions.

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

cs.AI · 2025-07-01 · conditional · novelty 6.0

Math reasoning gains in LLMs rarely transfer to general domains; RL tuning generalizes while SFT causes forgetting and representation drift.

LIMO: Less is More for Reasoning

cs.CL · 2025-02-05 · unverdicted · novelty 6.0

LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.

Simply Stabilizing the Loop via Fully Looped Transformer

cs.LG · 2026-05-11

citing papers explorer

Showing 7 of 7 citing papers.

Beyond Meta-Reasoning: Metacognitive Consolidation for Self-Improving LLM Reasoning cs.AI · 2026-04-19 · unverdicted · none · ref 18
Metacognitive Consolidation lets LLMs accumulate reusable meta-reasoning skills from past episodes to improve future performance across benchmarks.
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution cs.SE · 2025-02-25 · unverdicted · none · ref 151
SWE-RL uses RL on software evolution data to train LLMs achieving 41% on SWE-bench Verified with generalization to other reasoning tasks.
Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization cs.CL · 2026-05-12 · unverdicted · none · ref 54
Macro uses Direct Preference Optimization on composite-scored preference pairs to improve validity of multilingual self-generated counterfactual explanations by 12.55% on average without degrading minimality.
Rotation-Preserving Supervised Fine-Tuning cs.LG · 2026-05-08 · unverdicted · none · ref 122
RPSFT improves the in-domain versus out-of-domain performance trade-off during LLM supervised fine-tuning by penalizing rotations in pretrained singular subspaces as a proxy for loss-sensitive directions.
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning cs.AI · 2025-07-01 · conditional · none · ref 47
Math reasoning gains in LLMs rarely transfer to general domains; RL tuning generalizes while SFT causes forgetting and representation drift.
LIMO: Less is More for Reasoning cs.CL · 2025-02-05 · unverdicted · none · ref 2
LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.
Simply Stabilizing the Loop via Fully Looped Transformer cs.LG · 2026-05-11 · unreviewed · ref 22

Title resolution pending

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer