PRL-PUTS casts utility-weight tuning as a one-step, value-based RL task and sweeps a scalarization parameter along the Pareto front at inference time to generate and govern a family of policies, reporting a +0.13% lift in successful sessions on Pinterest Homefeed.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
A Production-Ready RL Framework for Personalized Utility Tuning with Pareto Sweeping in Pinterest Recommender Systems