Title resolution pending

Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph, and the title resolver will retry the DOI and OpenAlex lookups on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

LLMs as Assessors: Right for the Right Reason?

cs.IR · 2026-01-13 · unverdicted · novelty 5.0

LLMs judge document relevance at a level comparable to humans but frequently highlight different passages, indicating they are often not right for the right reasons and cannot fully replace human assessors.

REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control

cs.CL · 2025-11-25 · unverdicted · novelty 5.0

REFLEX improves explainable fact-checking by using verdict-anchored style control and self-disagreement signals to disentangle fact from style in LLM outputs, achieving SOTA results with minimal self-refined samples.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

citing papers explorer

Showing 3 of 3 citing papers.

LLMs as Assessors: Right for the Right Reason?
cs.IR · 2026-01-13 · unverdicted · none · ref 19

REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control
cs.CL · 2025-11-25 · unverdicted · none · ref 31

A Survey of Scaling in Large Language Model Reasoning
cs.AI · 2025-04-02 · unverdicted · none · ref 125
