A practical guide to multi-objective reinforcement learning and planning · 2022

1 Pith paper cites this work. Polarity classification is still indexing.



representative citing papers

Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement

cs.LG · 2024-09-04 · unverdicted · novelty 6.0

ERFSL uses LLMs to create per-requirement reward components, correct their code via a critic, and optimize weights with genetic-algorithm-style mutation and crossover driven by training logs, succeeding in a zero-shot data collection task.

citing papers explorer

Showing 1 of 1 citing paper.

Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement

cs.LG · 2024-09-04 · unverdicted · none · ref 1
