Quality-Diversity through AI Feedback

Andrew Dai; Grégory Schott; Hannah Teufel; Herbie Bradley; Jeff Clune; Jenny Zhang; Joel Lehman; Kenneth Stanley; Koen Oostermeijer; Marco Bellagente

arXiv: 2310.13032 · v4 · pith:UILHQF2Knew · submitted 2023-10-19 · cs.CL · cs.AI · cs.LG · cs.NE


classification: cs.CL, cs.AI, cs.LG, cs.NE

keywords: search, creative, feedback, domains, evaluate, human, qdaif, quality-diversity


In many text-generation problems, users may prefer not only a single response, but a diverse range of high-quality outputs from which to choose. Quality-diversity (QD) search algorithms aim at such outcomes, by continually improving and diversifying a population of candidates. However, the applicability of QD to qualitative domains, like creative writing, has been limited by the difficulty of algorithmically specifying measures of quality and diversity. Interestingly, recent developments in language models (LMs) have enabled guiding search through AI feedback, wherein LMs are prompted in natural language to evaluate qualitative aspects of text. Leveraging this development, we introduce Quality-Diversity through AI Feedback (QDAIF), wherein an evolutionary algorithm applies LMs to both generate variation and evaluate the quality and diversity of candidate text. When assessed on creative writing domains, QDAIF covers more of a specified search space with high-quality samples than do non-QD controls. Further, human evaluation of QDAIF-generated creative texts validates reasonable agreement between AI and human evaluation. Our results thus highlight the potential of AI feedback to guide open-ended search for creative and original solutions, providing a recipe that seemingly generalizes to many domains and modalities. In this way, QDAIF is a step towards AI systems that can independently search, diversify, evaluate, and improve, which are among the core skills underlying human society's capacity for innovation.
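The loop described in the abstract (an evolutionary algorithm in which LMs both generate variation and evaluate quality and diversity) can be illustrated with a minimal sketch. It assumes a MAP-Elites-style archive, a common QD algorithm that the abstract itself does not name, and it replaces the LM calls with toy stand-ins so that it runs offline. The names `qd_search`, `toy_mutate`, and `toy_evaluate`, the three-word vocabulary, and the bin count are all hypothetical and not taken from the paper.

```python
import random
from typing import Callable, Dict, List, Tuple

Evaluation = Tuple[float, float]  # (quality, diversity measure), both in [0, 1]


def qd_search(
    seed_texts: List[str],
    mutate: Callable[[str, random.Random], str],
    evaluate: Callable[[str], Evaluation],
    n_bins: int = 5,
    iterations: int = 200,
    rng_seed: int = 0,
) -> Dict[int, Tuple[str, float]]:
    """MAP-Elites-style loop: keep the best text found in each diversity bin.

    seed_texts must be non-empty so the archive has a parent to sample from.
    """
    rng = random.Random(rng_seed)
    archive: Dict[int, Tuple[str, float]] = {}

    def try_insert(text: str) -> None:
        quality, measure = evaluate(text)
        # Map the diversity measure to a bin index, clamping measure == 1.0.
        bin_index = min(int(measure * n_bins), n_bins - 1)
        incumbent = archive.get(bin_index)
        if incumbent is None or quality > incumbent[1]:
            archive[bin_index] = (text, quality)

    for text in seed_texts:
        try_insert(text)
    for _ in range(iterations):
        # Pick a random elite as the parent, then try to place its variation.
        parent, _quality = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))
    return archive


# Toy stand-ins (hypothetical): a real system would prompt an LM in both places.
HAPPY, SAD, FILLER = "joy", "gloom", "um"
VOCAB = [HAPPY, SAD, FILLER]


def toy_mutate(text: str, rng: random.Random) -> str:
    """Stand-in for LM-generated variation: replace one random word."""
    words = text.split()
    words[rng.randrange(len(words))] = rng.choice(VOCAB)
    return " ".join(words)


def toy_evaluate(text: str) -> Evaluation:
    """Stand-in for AI feedback: score quality and a 'sentiment' measure."""
    words = text.split()
    content = [w for w in words if w != FILLER]
    quality = len(content) / len(words)  # fewer filler words = higher quality
    # Diversity measure: share of content words that are "happy" (0.5 if none).
    measure = content.count(HAPPY) / len(content) if content else 0.5
    return quality, measure


archive = qd_search(["um um um um um um"], toy_mutate, toy_evaluate)
for bin_index in sorted(archive):
    text, quality = archive[bin_index]
    print(bin_index, round(quality, 2), text)
```

In a setup closer to the one the abstract describes, `toy_mutate` and `toy_evaluate` would each be replaced by a prompted LM call, while the archive logic could stay the same. Filling more bins with higher-quality entries is one way to read "covers more of a specified search space with high-quality samples".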


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

DEI: Diversity in Evolutionary Inference for Quality-Diversity Search
cs.LG · 2026-05 · unverdicted · novelty 6.0

DEI shows that a heterogeneous four-LLM ensemble achieves a 124% higher QD-Score and 28% higher coverage than single-model baselines on Core War at an equal compute budget.

Unlocking LLM Creativity in Science through Analogical Reasoning
cs.AI · 2026-05 · conditional · novelty 6.0

Analogical reasoning increases LLM solution diversity by 90-173% and raises the novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.