OMNI: Open-endedness via Models of human Notions of Interestingness
read the original abstract
Open-ended algorithms aim to learn new, interesting behaviors forever. That requires a vast environment search space, but there are thus infinitely many possible tasks. Even after filtering for tasks the current agent can learn (i.e., learning progress), countless learnable yet uninteresting tasks remain (e.g., minor variations of previously learned tasks). An Achilles Heel of open-endedness research is the inability to quantify (and thus prioritize) tasks that are not just learnable, but also $\textit{interesting}$ (e.g., worthwhile and novel). We propose solving this problem by $\textit{Open-endedness via Models of human Notions of Interestingness}$ (OMNI). The insight is that we can utilize foundation models (FMs) as a model of interestingness (MoI), because they $\textit{already}$ internalize human concepts of interestingness from training on vast amounts of human-generated data, where humans naturally write about what they find interesting or boring. We show that FM-based MoIs improve open-ended learning by focusing on tasks that are both learnable $\textit{and interesting}$, outperforming baselines based on uniform task sampling or learning progress alone. This approach has the potential to dramatically advance the ability to intelligently select which tasks to focus on next (i.e., auto-curricula), and could be seen as AI selecting its own next task to learn, facilitating self-improving AI and AI-Generating Algorithms. Project website at https://www.jennyzhangzt.com/omni/
This paper has not been read by Pith yet.
Forward citations
Cited by 6 Pith papers
-
Heuresis: Search Strategies for Autonomous AI Research Agents Across Quality, Diversity and Novelty
Heuresis evaluates six search strategies for LLM research agents and shows they steer ideas along quality-diversity-novelty axes but fail to generate novel ideas that match or exceed known high-performing recipes.
-
Heuresis: Search Strategies for Autonomous AI Research Agents Across Quality, Diversity and Novelty
Heuresis evaluates six search strategies for autonomous ML research agents and finds that novel ideas are rare, none rated original, and only one reaches top-10 quality while strategies steer axes but do not expand th...
-
Goal-Conditioned Agents that Learn Everything All at Once
LEO enables efficient all-goals learning in goal-conditioned RL by jointly predicting for all goals in one network pass, yielding >250x speedup over relabelling and better performance on Craftax.
-
When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.
-
Language Model Goal Selection Differs from Humans' in a Self-Directed Learning Task
LLMs diverge from human goal selection in self-directed learning by exploiting single solutions with low variability across instances.
-
Interestingness as an Inductive Heuristic for Future Compression Progress
Interestingness is defined as an inductive signal for future compression progress, with proofs that expected progress decays exponentially with time since last breakthrough and that the Algorithmic Prior yields quadra...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.