Let’s verify step by step

Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.AI · 2026-04-22 · unverdicted · novelty 7.0

HiPO improves LLM reasoning performance by optimizing preferences separately on response segments rather than entire outputs.

Showing 1 of 1 citing paper.

HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs cs.AI · 2026-04-22 · unverdicted · none · ref 7
HiPO improves LLM reasoning performance by optimizing preferences separately on response segments rather than entire outputs.