arXiv preprint arXiv:2502.06884 (2025)

Sina Tayebati, Divake Kumar, Nastaran Darabi, Dinithi Jayasuriya, Ranganath Krishnan, Amit Ranjan Trivedi · 2025 · arXiv 2502.06884

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VLM Judges Can Rank but Cannot Score: Task-Dependent Uncertainty in Multimodal Evaluation

cs.LG · 2026-04-28 · unverdicted · novelty 7.0

VLM judges exhibit task-dependent uncertainty in their scores, with conformal prediction revealing wide intervals for complex tasks and a decoupling between good ranking performance and poor absolute scoring reliability.

From Scalars to Tensors: Declared Losses Recover Epistemic Distinctions That Neutrosophic Scalars Cannot Express

cs.AI · 2026-03-10 · unverdicted · novelty 7.0

Declared losses recover epistemic distinctions collapsed by scalar neutrosophic T/I/F values in LLM evaluations.

Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents

cs.CL · 2026-04-30 · unverdicted · novelty 6.0

RSCB-MC is a risk-sensitive contextual bandit memory controller for LLM coding agents that chooses safe actions including abstention, achieving 60.5% proxy success with 0% false positives and low latency in 200-case validation.

LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy

cs.LG · 2026-05-05

citing papers explorer

Showing 4 of 4 citing papers.

VLM Judges Can Rank but Cannot Score: Task-Dependent Uncertainty in Multimodal Evaluation
cs.LG · 2026-04-28 · unverdicted · none · ref 7

From Scalars to Tensors: Declared Losses Recover Epistemic Distinctions That Neutrosophic Scalars Cannot Express
cs.AI · 2026-03-10 · unverdicted · none · ref 9

Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents
cs.CL · 2026-04-30 · unverdicted · none · ref 25

LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy
cs.LG · 2026-05-05 · unreviewed · ref 12
