Synthetic data for evaluation: Supporting LLM-as-a-judge workflows with EvalAssist

Martín Santillán Cooper, Zahra Ashktorab, Hyo Jin Do, Erik Miehling, Werner Geyer, Jasmina Gajcin, Elizabeth M Daly, Qian Pan, Michael Desmond · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

NodeSynth: Socially Aligned Synthetic Data for AI Evaluation

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

NodeSynth generates evidence-anchored synthetic queries that trigger up to five times higher failure rates in mainstream LLMs than human-authored benchmarks.

citing papers explorer

NodeSynth: Socially Aligned Synthetic Data for AI Evaluation · cs.LG · 2026-05-14 · unverdicted · none · ref 34