ERFSL generates and optimizes LLM-based reward functions for custom multi-objective RL, correcting codes in one iteration and converging weights in 5.2 iterations on average even from 500x errors.
IEEE/CVF Conference on Computer Vision and Pattern Recognition , year=
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
eess.SY 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
ERFSL: An Efficient Reward Function Searcher via Language Models for Custom-Environment Multi-Objective Optimization (Student Abstract)
ERFSL generates and optimizes LLM-based reward functions for custom multi-objective RL, correcting codes in one iteration and converging weights in 5.2 iterations on average even from 500x errors.