Improve LLM-as-a-judge ability as a general ability
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
verdicts: unverdicted 2
roles: background 1
polarities: background 1
representative citing papers
A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.
citing papers explorer
- LLM-as-Judge for Semantic Judging of Powerline Segmentation in UAV Inspection
  An LLM produces consistent categorical judgments and appropriate confidence declines when evaluating powerline segmentation quality under controlled visual degradations, suggesting it can serve as a reliable watchdog.
- A Survey on LLM-as-a-Judge
  A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.