Prefix gain measured via student-model solve-rate improvement is used to train a Prefix Utility Model (PUM) that supplies stronger supervision than correctness-based process rewards for mathematical reasoning.
T opi OCQA : Open-domain Conversational Question Answering with Topic Switching
6 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 6years
2026 6verdicts
UNVERDICTED 6representative citing papers
MSPA-CQR improves conversational query rewriting by constructing self-consistent preference data across rewriting, retrieval, and response dimensions and training with prefix-guided multi-faceted direct preference optimization, showing effectiveness in both in- and out-of-distribution settings.
LLM-extracted patterns merging logical structures and linguistic cues yield statistically significant gains in fallacy classification over zero-shot baselines with cross-dataset generalization.
MMoA adds LSTM recurrence to Mixture-of-Agents routing, reaching 58.0% win rate on AlpacaEval 2.0 versus 59.8% for baseline MoA while cutting runtime by up to 4.6%.
Fine-tuned PEGASUS achieves state-of-the-art ROUGE scores on XL-Sum English corpus with 4.04% ROUGE-1, 15.25% ROUGE-2, and 3.39% ROUGE-L gains over mT5 baseline.
A multi-turn RAG system combines learned sparse retrieval with LLM-conditioned rewriting, listwise reranking, and generation to handle conversational QA and unanswerable queries across four domains.
citing papers explorer
-
From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning
Prefix gain measured via student-model solve-rate improvement is used to train a Prefix Utility Model (PUM) that supplies stronger supervision than correctness-based process rewards for mathematical reasoning.
-
Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search
MSPA-CQR improves conversational query rewriting by constructing self-consistent preference data across rewriting, retrieval, and response dimensions and training with prefix-guided multi-faceted direct preference optimization, showing effectiveness in both in- and out-of-distribution settings.
-
Beyond Logical Forms: LLM-Extracted Patterns for Fallacy Classification
LLM-extracted patterns merging logical structures and linguistic cues yield statistically significant gains in fallacy classification over zero-shot baselines with cross-dataset generalization.
-
MMoA: An AI-Agent framework with recurrence for Memoried Mixure-of-Agent
MMoA adds LSTM recurrence to Mixture-of-Agents routing, reaching 58.0% win rate on AlpacaEval 2.0 versus 59.8% for baseline MoA while cutting runtime by up to 4.6%.
-
Optimizing Abstractive Summarization With Fine-Tuned PEGASUS
Fine-tuned PEGASUS achieves state-of-the-art ROUGE scores on XL-Sum English corpus with 4.04% ROUGE-1, 15.25% ROUGE-2, and 3.39% ROUGE-L gains over mT5 baseline.
-
uva-irlab-conv at SemEval-2026 Task 8: Multi-Turn RAG with Learned Sparse Retrieval and Listwise Reranking
A multi-turn RAG system combines learned sparse retrieval with LLM-conditioned rewriting, listwise reranking, and generation to handle conversational QA and unanswerable queries across four domains.