Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- Security in LLM-as-a-Judge: A Comprehensive SoK
  The first SoK on LLM-as-a-Judge security organizes attacks targeting judges, attacks using judges, defenses leveraging judges, and security-domain applications while flagging vulnerabilities.
- General Preference Reinforcement Learning
  GPRL carries a k-dimensional skew-symmetric preference structure into policy updates with per-dimension advantages and a drift monitor, yielding 56.51% length-controlled win rate on AlpacaEval 2.0 from Llama-3-8B-Instruct while outperforming SimPO and SPPO on other benchmarks.
- Auditing Multimodal LLM Raters: Central Tendency Bias in Clinical Ordinal Scoring
  Multimodal LLMs exhibit central tendency bias when scoring ordinal clinical images, over-predicting low scores and under-predicting high scores even after prompt ablations.