Prompt Duel Optimizer uses dueling bandits and LLM-as-judge pairwise feedback with Double Thompson Sampling and top-performer mutation to find stronger prompts than label-free baselines on BBH and MS MARCO under limited comparison budgets.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1). Years: 2025 (1). Verdicts: unverdicted (1).
Representative citing papers:
LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization
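The summary names dueling bandits with pairwise judge feedback and Double Thompson Sampling but gives no procedure. The following is a minimal sketch of that general idea, not the paper's implementation. It simplifies Double Thompson Sampling by dropping the confidence-bound pruning step, it leaves out top-performer mutation entirely, and it replaces the LLM judge with a simulated Bradley-Terry coin flip. The names `select_duel`, `run_duels`, and the hidden `quality` scores are hypothetical and exist only for illustration.

```python
import random


def select_duel(wins, rng):
    """Pick two distinct prompt indices to compare (simplified two-stage sampling)."""
    k = len(wins)
    # First sample: draw a plausible win probability for every pair from a
    # Beta posterior built on the pairwise win counts seen so far.
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            p = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[i][j], theta[j][i] = p, 1.0 - p
    # Copeland-style score: how many rivals each prompt beats in this sample.
    scores = [sum(theta[i][j] > 0.5 for j in range(k) if j != i) for i in range(k)]
    best = max(scores)
    first = rng.choice([i for i in range(k) if scores[i] == best])
    # Second sample: redraw only the matchups against `first` and pick the
    # challenger that looks strongest in this fresh draw.
    challenger = {
        j: rng.betavariate(wins[j][first] + 1, wins[first][j] + 1)
        for j in range(k)
        if j != first
    }
    second = max(challenger, key=challenger.get)
    return first, second


def run_duels(quality, budget, seed=0):
    """Spend `budget` pairwise comparisons and return the win-count matrix."""
    rng = random.Random(seed)
    k = len(quality)
    wins = [[0] * k for _ in range(k)]
    for _ in range(budget):
        a, b = select_duel(wins, rng)
        # Assumption: stand-in for an LLM judge. Prompt `a` wins with
        # probability quality[a] / (quality[a] + quality[b]).
        if rng.random() < quality[a] / (quality[a] + quality[b]):
            wins[a][b] += 1
        else:
            wins[b][a] += 1
    return wins


if __name__ == "__main__":
    # Three hypothetical prompts with made-up hidden quality scores.
    for row in run_duels([1.0, 2.0, 8.0], budget=200):
        print(row)
```

Each comparison consumes one unit of the budget, so the win matrix always sums to `budget`. In a real setting the simulated coin flip would be replaced by a call that asks a judge model which of two prompts produced the better output.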