Johnson, Martin San- tillan Cooper, Elizabeth M

Zahra Ashktorab, Michael Desmond, Qian Pan, James M Johnson, Martin Santillan Cooper, Elizabeth M Daly, Rahul Nair, Tejaswini Pedapati, Swapnaja Achintalwar, Werner Geyer · 2024 · arXiv 2410.00873

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Evalet: Evaluating Large Language Models through Functional Fragmentation

cs.HC · 2025-09-14 · conditional · novelty 7.0

Evalet applies functional fragmentation to deliver fragment-level qualitative analysis of LLM evaluations, with a user study showing 48% more misalignment detections than holistic scoring.

MultEval: Supporting Collaborative Alignment for LLM-as-a-Judge Evaluation Criteria

cs.HC · 2026-04-29 · unverdicted · novelty 6.0

MultEval supports collaborative creation of LLM-as-a-judge criteria by surfacing disagreements via consensus-building methods, allowing iterative revisions with examples and history, and keeping transparent how human judgments become automated rules.

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

cs.CL · 2024-12-07 · accept · novelty 3.0

A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

citing papers explorer

Showing 3 of 3 citing papers.

Evalet: Evaluating Large Language Models through Functional Fragmentation cs.HC · 2025-09-14 · conditional · none · ref 7
Evalet applies functional fragmentation to deliver fragment-level qualitative analysis of LLM evaluations, with a user study showing 48% more misalignment detections than holistic scoring.
MultEval: Supporting Collaborative Alignment for LLM-as-a-Judge Evaluation Criteria cs.HC · 2026-04-29 · unverdicted · none · ref 3
MultEval supports collaborative creation of LLM-as-a-judge criteria by surfacing disagreements via consensus-building methods, allowing iterative revisions with examples and history, and keeping transparent how human judgments become automated rules.
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods cs.CL · 2024-12-07 · accept · none · ref 5
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

Johnson, Martin San- tillan Cooper, Elizabeth M

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer