Prompting LLMs with test-taking strategies for true/false factuality checks reduces tokens by over 80%, matches strong baselines on two benchmarks with SOTA on one, and enables fine-tuned SLMs to perform similarly at low cost with rationales.
Weizhe Yuan, Graham Neubig, and Pengfei Liu
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Aggregating imperfect factuality metrics into preference data from lexically similar summaries yields consistent factuality gains across model sizes, allowing smaller models to approach larger ones.
citing papers explorer
- Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies
  Prompting LLMs with test-taking strategies for true/false factuality checks reduces tokens by over 80%, matches strong baselines on two benchmarks with SOTA on one, and enables fine-tuned SLMs to perform similarly at low cost with rationales.
- Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect Metrics
  Aggregating imperfect factuality metrics into preference data from lexically similar summaries yields consistent factuality gains across model sizes, allowing smaller models to approach larger ones.