EIP: Weighted ranking of LLMs by quantifying question difficulty

Xingjian Hu, Ziqian Zhang, Yue Huang, Kai Zhang, Ruoxi Chen, Yixin Liu, Qingsong Wen, Kaidi Xu, Xiangliang Zhang, Neil Zhenqiang Gong, Lichao Sun · 2026

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

RankJudge creates paired multi-turn conversations with isolated single-turn flaws to generate unambiguous benchmarks for LLM-as-a-judge systems across ML, biomedicine, and finance domains.

citing papers explorer

Showing 1 of 1 citing paper.

RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator cs.CL · 2026-05-20 · unverdicted · none · ref 8
RankJudge creates paired multi-turn conversations with isolated single-turn flaws to generate unambiguous benchmarks for LLM-as-a-judge systems across ML, biomedicine, and finance domains.

EIP: Weighted ranking of LLMs by quantifying question difficulty

fields

years

verdicts

representative citing papers

citing papers explorer