Investigating selective prediction approaches across several tasks in iid, ood, and adversarial settings

Neeraj Varshney, Swaroop Mishra, Chitta Baral · 2022 · arXiv 2203.00211

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

cs.AI · 2023-08-10 · accept · novelty 5.0

Survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight sub-categories on popular models, and finds that more aligned models generally score higher but with varying effectiveness.

citing papers explorer

Showing 2 of 2 citing papers.

Language Models (Mostly) Know What They Know cs.CL · 2022-07-11 · unverdicted · none · ref 8
Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment cs.AI · 2023-08-10 · accept · none · ref 104
Survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight sub-categories on popular models, and finds that more aligned models generally score higher but with varying effectiveness.

Investigating selective prediction approaches across several tasks in iid, ood, and adversarial settings

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer