Atkins and Edmund H

Ella M · 1997

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Process Reward Models Meet Planning: Generating Precise and Scalable Datasets for Step-Level Rewards

cs.CL · 2026-04-20 · unverdicted · novelty 6.0 · ref 53

PDDL planning problems are used to generate about one million precise reasoning steps for training Process Reward Models, and adding this data to existing datasets improves LLM performance on both mathematical and non-mathematical reasoning benchmarks.
