arXiv preprint arXiv:2406.13408 , year=

· 2024 · arXiv 2406.13408

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

FVRuleLearner: Operator-Level Reasoning Tree (OP-Tree)-Based Rules Learning for Formal Verification

cs.AR · 2026-03-06 · unverdicted · novelty 7.0

FVRuleLearner introduces an Operator Reasoning Tree to learn operator-specific rules that improve natural-language to SystemVerilog assertion generation, raising syntax correctness by 3.95% and functional correctness by 31.17% over baselines.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

citing papers explorer

Showing 3 of 3 citing papers.

FVRuleLearner: Operator-Level Reasoning Tree (OP-Tree)-Based Rules Learning for Formal Verification cs.AR · 2026-03-06 · unverdicted · none · ref 28
FVRuleLearner introduces an Operator Reasoning Tree to learn operator-specific rules that improve natural-language to SystemVerilog assertion generation, raising syntax correctness by 3.95% and functional correctness by 31.17% over baselines.
Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning cs.CL · 2026-04-11 · unverdicted · none · ref 203
APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.
Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs cs.CL · 2026-04-11 · unverdicted · none · ref 218
FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

arXiv preprint arXiv:2406.13408 , year=

fields

years

verdicts

representative citing papers

citing papers explorer