Introducing GPT-5.4 mini and nano

OpenAI · 2026

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

POW3R adapts rubric criterion weights via rollout contrast in RLVR to improve mean reward, strict completion rates, and training speed over static rubric aggregation on multimodal and text tasks.

citing papers explorer

Showing 1 of 1 citing paper.

Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR cs.AI · 2026-05-19 · unverdicted · none · ref 16
POW3R adapts rubric criterion weights via rollout contrast in RLVR to improve mean reward, strict completion rates, and training speed over static rubric aggregation on multimodal and text tasks.

Introducing GPT-5.4 mini and nano

fields

years

verdicts

representative citing papers

citing papers explorer