ACM Computing Surveys

A survey on uncertainty quantification of large language models: Taxonomy, open research challenges, future directions · 2025

2 Pith papers cite this work.

representative citing papers

Estimating LLM Grading Ability and Response Difficulty in Automatic Short Answer Grading via Item Response Theory

cs.CL · 2026-04-30 · unverdicted · novelty 7.0 · 2 refs

Item response theory applied to 17 LLMs on SciEntsBank and Beetle reveals that models with similar overall scores differ sharply in robustness to difficult responses, with errors clustering on partial-credit labels.

ProcCtrlBench: Evaluating Process-Level Defects and Control Preservation in LLM Coding Agents

cs.SE · 2026-05-18 · 2 refs

citing papers explorer

Estimating LLM Grading Ability and Response Difficulty in Automatic Short Answer Grading via Item Response Theory

cs.CL · 2026-04-30 · unverdicted · none · ref 36 · 2 links

Item response theory applied to 17 LLMs on SciEntsBank and Beetle reveals that models with similar overall scores differ sharply in robustness to difficult responses, with errors clustering on partial-credit labels.

ProcCtrlBench: Evaluating Process-Level Defects and Control Preservation in LLM Coding Agents

cs.SE · 2026-05-18 · unreviewed · ref 23 · 2 links
