Dense dual-encoder retrievers outperform BM25 by 9-19% absolute in top-20 passage retrieval accuracy across open-domain QA datasets and enable new state-of-the-art end-to-end QA results.
Proceedings of the 22nd international conference on Machine learning , pages=
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
DPO-RLHF equivalence holds only conditionally on the optimal policy preferring human-preferred responses; otherwise DPO optimizes relative advantage and can prefer worse outputs, addressed by introducing CPO.
ZipRerank delivers state-of-the-art multimodal listwise reranking accuracy for long documents at up to 10x lower latency via early interaction and single-pass scoring.
ReSIDe generalizes logit-based confidence scores to intermediate layers of synthetic image detectors and uses preference optimization to aggregate them, cutting area under the risk-coverage curve by up to 69.55% under covariate shifts.
DMoA is a differentiable multi-agent LLM framework with recurrent context-aware routing and predictive entropy self-supervision that claims SOTA results on 9 benchmarks through elastic agent collaboration.
Reproducibility study diagnoses semantic drift in PO4ISR and introduces PO4ISR++ with reflexive prompting that restores performance with gains up to 54% on Games and 96% on Bundle.
citing papers explorer
-
Dense Passage Retrieval for Open-Domain Question Answering
Dense dual-encoder retrievers outperform BM25 by 9-19% absolute in top-20 passage retrieval accuracy across open-domain QA datasets and enable new state-of-the-art end-to-end QA results.
-
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
DPO-RLHF equivalence holds only conditionally on the optimal policy preferring human-preferred responses; otherwise DPO optimizes relative advantage and can prefer worse outputs, addressed by introducing CPO.
-
Very Efficient Listwise Multimodal Reranking for Long Documents
ZipRerank delivers state-of-the-art multimodal listwise reranking accuracy for long documents at up to 10x lower latency via early interaction and single-pass scoring.
-
Post-hoc Selective Classification for Reliable Synthetic Image Detection
ReSIDe generalizes logit-based confidence scores to intermediate layers of synthetic image detectors and uses preference optimization to aggregate them, cutting area under the risk-coverage curve by up to 69.55% under covariate shifts.
-
Differentiable Mixture-of-Agents Incentivizes Swarm Intelligence of Large Language Models
DMoA is a differentiable multi-agent LLM framework with recurrent context-aware routing and predictive entropy self-supervision that claims SOTA results on 9 benchmarks through elastic agent collaboration.
-
A Reproducibility Analysis of PO4ISR: Diagnosing and Mitigating Semantic Drift in LLM-Based Session Recommendation
Reproducibility study diagnoses semantic drift in PO4ISR and introduces PO4ISR++ with reflexive prompting that restores performance with gains up to 54% on Games and 96% on Bundle.