KnowRL decomposes RL guidance into atomic knowledge points and uses Constrained Subset Search to build minimal-sufficient subsets, yielding 70.08 average accuracy without hints and 74.16 with them on 1.5B-scale models across eight benchmarks.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance