Using incomplete and incorrect plans to shape reinforcement learning in long-sequence sparse-reward tasks

2025 · DOI 10.1007/s00521-024-10615-2

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Automating Potential-based Reward Shaping with Vision Language Model Guidance

cs.LG · 2026-06-25 · unverdicted · novelty 7.0

VLM-PBRS trains a potential function from small-VLM preferences to enable PBRS in RL, improving sample efficiency in Meta-World and Franka Kitchen without reward hacking.

citing papers explorer

Automating Potential-based Reward Shaping with Vision Language Model Guidance

cs.LG · 2026-06-25 · unverdicted · none · ref 37
