If the prompt allows responses that contain clear logical fallacies but still reach a correct result, this is considered Fallacy-Oversight Bias.

Fallacy-Oversight Bias: Language models may overlook logical fallacies during evaluation.
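As a rough illustration of the definition, the sketch below measures how often a judge accepts answers whose final result is correct but whose reasoning is fallacious. It is not the method of any cited paper: the `Judge` signature, `fallacy_oversight_rate`, `naive_judge`, and the sample cases are all hypothetical names and data introduced here, and a real evaluation would call an actual LLM judge instead of the toy stand-in.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical judge signature (an assumption for this sketch):
# (question, answer) -> True if the judge accepts the answer.
Judge = Callable[[str, str], bool]


def fallacy_oversight_rate(judge: Judge, cases: Sequence[Tuple[str, str]]) -> float:
    """Fraction of fallacious-but-correct answers that the judge still accepts."""
    if not cases:
        raise ValueError("need at least one case")
    accepted = sum(1 for question, answer in cases if judge(question, answer))
    return accepted / len(cases)


def naive_judge(question: str, answer: str) -> bool:
    # Toy stand-in for an LLM judge: checks only the final line
    # and ignores the reasoning that precedes it.
    return answer.strip().splitlines()[-1] == "Answer: 4"


# Illustrative cases: each answer reaches the right result through flawed reasoning.
cases = [
    ("What is 2 + 2?", "Everyone I know says it is 4, so it must be true.\nAnswer: 4"),
    ("What is 2 + 2?", "All sums equal an odd number minus one, and 5 is odd.\nAnswer: 4"),
]

rate = fallacy_oversight_rate(naive_judge, cases)
print(rate)
```

A rate near 1 suggests the judge rewards the final result regardless of the reasoning, while a judge that checks the reasoning should score closer to 0 on cases like these.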

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge

cs.CL · 2024-10-03 · unverdicted · novelty 6.0

LLM-as-a-Judge systems exhibit significant biases in specific tasks despite strong overall performance, as measured by the new CALM quantification framework.

citing papers explorer

Showing 1 of 1 citing paper.

Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge cs.CL · 2024-10-03 · unverdicted · none · ref 23
LLM-as-a-Judge systems exhibit significant biases in specific tasks despite strong overall performance, as measured by the new CALM quantification framework.
