LLMs show strong user bias in role-tagged contexts that is amplified by preference alignment and can be reduced or controlled through targeted fine-tuning and DPO.
hub
Refiner: Reasoning feedback on intermediate representations
11 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
Reflexion lets LLM agents improve via stored verbal reflections on task feedback, reaching 91% pass@1 on HumanEval and outperforming prior GPT-4 results.
M2CL trains per-agent context generators with a self-adaptive mechanism to maintain coherence and reduce output discrepancies in multi-LLM discussions, yielding 20-50% gains on reasoning, embodied, and mobile control tasks.
SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.
Self-RAG trains LLMs to adaptively retrieve passages on demand and self-critique using reflection tokens, outperforming ChatGPT and retrieval-augmented Llama2 on QA, reasoning, and fact verification.
LLMs cannot reliably self-correct reasoning mistakes using only their internal capabilities and often degrade in performance without external feedback.
RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
A co-evolving proposer-critic RL framework improves GUI grounding accuracy by letting the model critique its own proposals rendered on screenshots.
TRACES tags reasoning steps to enable adaptive early stopping, cutting token use by 20-50% on MATH500, GSM8K, AIME, MMLU and GPQA with comparable accuracy.
Enforcing structured reflection via Outlines-based constrained decoding on an 8B LLM triggers structure snowballing instead of better self-correction, producing near-perfect syntax but persistent semantic errors and revealing an alignment tax.
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.
citing papers explorer
-
User-Assistant Bias in LLMs
LLMs show strong user bias in role-tagged contexts that is amplified by preference alignment and can be reduced or controlled through targeted fine-tuning and DPO.
-
Reflexion: Language Agents with Verbal Reinforcement Learning
Reflexion lets LLM agents improve via stored verbal reflections on task feedback, reaching 91% pass@1 on HumanEval and outperforming prior GPT-4 results.
-
Context Learning for Multi-Agent Discussion
M2CL trains per-agent context generators with a self-adaptive mechanism to maintain coherence and reduce output discrepancies in multi-LLM discussions, yielding 20-50% gains on reasoning, embodied, and mobile control tasks.
-
Training Language Models to Self-Correct via Reinforcement Learning
SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.
-
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Self-RAG trains LLMs to adaptively retrieve passages on demand and self-critique using reflection tokens, outperforming ChatGPT and retrieval-augmented Llama2 on QA, reasoning, and fact verification.
-
Large Language Models Cannot Self-Correct Reasoning Yet
LLMs cannot reliably self-correct reasoning mistakes using only their internal capabilities and often degrade in performance without external feedback.
-
Reasoning with Language Model is Planning with World Model
RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
-
Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding
A co-evolving proposer-critic RL framework improves GUI grounding accuracy by letting the model critique its own proposals rendered on screenshots.
-
TRACES: Tagging Reasoning Steps for Adaptive Cost-Efficient Early-Stopping
TRACES tags reasoning steps to enable adaptive early stopping, cutting token use by 20-50% on MATH500, GSM8K, AIME, MMLU and GPQA with comparable accuracy.
-
From Hallucination to Structure Snowballing: The Alignment Tax of Constrained Decoding in LLM Reflection
Enforcing structured reflection via Outlines-based constrained decoding on an 8B LLM triggers structure snowballing instead of better self-correction, producing near-perfect syntax but persistent semantic errors and revealing an alignment tax.
-
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.