A panel of smaller diverse LLMs outperforms a single large model as an evaluator of generations, showing less intra-model bias and over 7x lower cost.
Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
A new SLT framework uses latent thoughts as a middle reasoning layer and plan-then-ground decoding to improve coherence and faithfulness in gloss-free sign language translation.
A dual-loop training strategy with gradient consistency lets vision-language models generate radiology reports from low-quality X-ray images without severe performance loss.
EnoTab is a dual denoising framework for TableQA that performs evidence-based question denoising via semantic unit decomposition and evidence tree-guided table pruning with post-order rollback to improve performance on complex questions and large-scale tables.
Position paper warns that model collapse in self-consuming multilingual LLM training loops risks flattening linguistic diversity and cultural nuance.
citing papers explorer
-
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
A panel of smaller diverse LLMs outperforms a single large model as an evaluator of generations, showing less intra-model bias and over 7x lower cost.
-
Think in Latent Thoughts: A New Paradigm for Gloss-Free Sign Language Translation
A new SLT framework uses latent thoughts as a middle reasoning layer and plan-then-ground decoding to improve coherence and faithfulness in gloss-free sign language translation.
-
Radiology Report Generation for Low-Quality X-Ray Images
A dual-loop training strategy with gradient consistency lets vision-language models generate radiology reports from low-quality X-ray images without severe performance loss.
-
When TableQA Meets Noise: A Dual Denoising Framework for Complex Questions and Large-scale Tables
EnoTab is a dual denoising framework for TableQA that performs evidence-based question denoising via semantic unit decomposition and evidence tree-guided table pruning with post-order rollback to improve performance on complex questions and large-scale tables.
-
Losing our Tail, Again: (Un)Natural Selection & Multilingual LLMs
Position paper warns that model collapse in self-consuming multilingual LLM training loops risks flattening linguistic diversity and cultural nuance.