GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

1 Pith paper cites this work. Polarity classification is still being indexed.

citation-role summary

baseline 1

citation-polarity summary

fields

cs.CV 1

years

2026 1

verdicts

CONDITIONAL 1

roles

baseline 1

polarities

baseline 1

representative citing papers

Flow-OPD: On-Policy Distillation for Flow Matching Models

cs.CV · 2026-05-08 · conditional · novelty 6.0 · 4 refs

Flow-OPD is a two-stage on-policy distillation method for flow matching models that lifts GenEval from 63 to 92 and OCR from 59 to 94 on SD 3.5 Medium while preserving fidelity.

citing papers explorer

Showing 1 of 1 citing paper.

  • Flow-OPD: On-Policy Distillation for Flow Matching Models · cs.CV · 2026-05-08 · conditional · none · ref 33 · 4 links