CoNL lets LLMs self-improve on non-verifiable tasks by rewarding critiques that produce better solutions in multi-agent conversations, jointly optimizing generation and judging without external feedback.
Find the number of positive integers𝑛 less than 1000 for which there exists a positive real number𝑥such that𝑛=𝑥⌊𝑥⌋
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Conversation for Non-verifiable Learning: Self-Evolving LLMs through Meta-Evaluation
CoNL lets LLMs self-improve on non-verifiable tasks by rewarding critiques that produce better solutions in multi-agent conversations, jointly optimizing generation and judging without external feedback.