The Goldilocks effect: Human infants allocate attention to visual sequences that are neither too simple nor too complex
2 Pith papers cite this work, both from 2026 and both currently unverdicted.
Citing papers
- Evolutionary Discovery of Developmental Reward Schedules in Deep Reinforcement Learning
  Evolutionary optimization discovers developmental reward schedules that improve performance over extrinsic-only baselines on some MiniGrid tasks, with novelty emerging as the dominant early signal.
- Playful Agentic Robot Learning
  RATs agents generate and solve their own exploratory tasks during play, distill successful code into a skill library, and reuse it to improve held-out task performance by 20.6 and 17.0 points on two benchmarks.