arXiv preprint arXiv:2409.13884 (2024)

A multi-llm debiasing framework · 2024 · arXiv 2409.13884

4 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process

cs.CL · 2025-12-29 · unverdicted · novelty 5.0

LLM-PeerReview ensembles LLMs by scoring responses with LLM-as-Judge and selecting the best via averaging or truth inference, beating Smoothie-Global by 6.9-7.3 points on four datasets.

Be a Partner, not a Bystander in Software Engineering Practice: Bridging the Gaps between Academia and Industry

cs.SE · 2026-02-08 · unverdicted · novelty 3.0

Survey evidence shows the software engineering community believes academia must shift from bystander to active partner with industry to boost research impact and relevance.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

cs.CL · 2024-12-07 · accept · novelty 3.0

A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

citing papers explorer

Showing 4 of 4 citing papers.

Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process
cs.CL · 2025-12-29 · unverdicted · none · ref 39
LLM-PeerReview ensembles LLMs by scoring responses with LLM-as-Judge and selecting the best via averaging or truth inference, beating Smoothie-Global by 6.9-7.3 points on four datasets.

Be a Partner, not a Bystander in Software Engineering Practice: Bridging the Gaps between Academia and Industry
cs.SE · 2026-02-08 · unverdicted · none · ref 14
Survey evidence shows the software engineering community believes academia must shift from bystander to active partner with industry to boost research impact and relevance.

A Survey of Scaling in Large Language Model Reasoning
cs.AI · 2025-04-02 · unverdicted · none · ref 149
A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
cs.CL · 2024-12-07 · accept · none · ref 173
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.
