AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering
6 Pith papers cite this work.
Years: 2026 (6)
Verdicts: UNVERDICTED (6)
citing papers explorer
- SAGE: Answer-Conditioned Uncertainty Targets for Verbal Uncertainty Alignment
  SAGE defines answer-conditioned uncertainty targets from model rollouts to enable GUPO training that aligns verbal uncertainty with distributional behavior.
- Confidence Laundering in Agent Systems: Why Uncertainty Needs a Latent Carrier
  Agent systems lose uncertainty at decision handoffs, causing downstream over-trust; the paper proposes latent uncertainty as a carrier to preserve pre-commitment fragility across interfaces.
- SeqRoute: Global Budget-Aware Sequential LLM Routing via Offline Reinforcement Learning
  SeqRoute applies offline RL with CQL and Hindsight Budget Relabeling to sequential LLM routing under global budgets, claiming 6.0-73.5% cost reduction, maintained or improved quality, and under 1% bankruptcy rate.
- Learning Agent Routing From Early Experience
  BoundaryRouter routes queries to LLM or agent using early experience memory from a seed set, cutting inference time 60.6% versus always using agents and raising performance 28.6% versus always using direct LLM inference.
- CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era
  CiteAudit supplies a human-validated benchmark and multi-agent verification system that outperforms existing LLMs and commercial tools at detecting hallucinated scientific references.
- Why Semantic Entropy Fails: Geometry-Aware and Calibrated Uncertainty for Policy Optimization
  Identifies two gaps in entropy-based uncertainty for LLM post-training and proposes GCPO to align geometry-aware disagreement measures with reward-based calibration for better gradient regulation.