
Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, and Xiang Yue

Zijun Yao, Yantao Liu, Yanxu Chen, Junfeng Chen, Lei Hou, Juanzi Li, Tat-Seng Chua · 2025 · arXiv 2505.23646

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence

cs.CL · 2026-04-03 · unverdicted · novelty 7.0

BAS aggregates utility from an answer-or-abstain model across risk thresholds and is uniquely maximized by truthful confidence estimates.

Reasoning Model Is Superior LLM-Judge, Yet Suffers from Biases

cs.CL · 2026-01-07 · unverdicted · novelty 7.0

Reasoning models judge better than non-reasoning LLMs yet retain biases; generating an evaluation plan first mitigates bias without losing accuracy.

Hallucinations Undermine Trust; Metacognition is a Way Forward

cs.CL · 2026-05-02 · unverdicted · novelty 6.0

LLMs need metacognition to align expressed uncertainty with their actual knowledge boundaries, moving beyond knowledge expansion to reduce confident errors.

NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons

cs.CL · 2026-04-03 · unverdicted · novelty 6.0

NeuReasoner detects neuron fluctuation patterns linked to reasoning failures and inserts special tokens to enable controllable self-correction, delivering up to 27% performance gains and 19-63% lower token use across multiple benchmarks and model sizes.

Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping

cs.LG · 2026-01-24 · unverdicted · novelty 6.0

ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

cs.LG · 2026-04-15 · unverdicted · novelty 5.0

The paper introduces the Proxy Compression Hypothesis as a unifying framework explaining reward hacking in RLHF as an emergent result of compressing high-dimensional human objectives into proxy reward signals under optimization pressure.

FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FAITH improves LLM factual accuracy by mapping confidence and semantic entropy into natural-language knowledge-state quadrants for trustworthiness and honestness, then applying PPO with a combined reward and retrieval augmentation.

Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search

cs.AI · 2026-04-09 · unverdicted · novelty 5.0

HiExp extracts hierarchical experience knowledge from reasoning trajectories via contrastive analysis and clustering to regularize RL training, turning stochastic exploration into strategic search with reported gains in performance and generalization.

Self-Rewarding Vision-Language Model via Reasoning Decomposition

cs.CV · 2025-08-27 · unverdicted · novelty 5.0

Vision SR1 decomposes VLM reasoning into visual and language components and uses internal self-rewards to improve visual reasoning and reduce hallucinations more efficiently than external-supervision methods.

KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

cs.AI · 2025-06-24 · unverdicted · novelty 5.0

KnowRL integrates a knowledge-verification factuality reward into RL training to enforce fact-based reasoning steps and lower hallucination rates in LLMs.

Evaluating Reasoning Models for Queries with Presuppositions

cs.CL · 2026-05-04 · unverdicted · novelty 4.0

Reasoning models achieve only 2-11% higher accuracy than non-reasoning models when handling queries with false presuppositions, failing to challenge 26-42% of them and remaining sensitive to presupposition strength.

Rethinking Agentic Reinforcement Learning In Large Language Models

cs.AI · 2026-04-30 · unverdicted · novelty 3.0 · 3 refs

The paper reviews conceptual foundations, methodological innovations, effective designs, critical challenges, and future directions for LLM-based Agentic Reinforcement Learning.

Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards

cs.LG · 2025-09-26

citing papers explorer

Showing 13 of 13 citing papers.

BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence
cs.CL · 2026-04-03 · unverdicted · none · ref 53
BAS aggregates utility from an answer-or-abstain model across risk thresholds and is uniquely maximized by truthful confidence estimates.

Reasoning Model Is Superior LLM-Judge, Yet Suffers from Biases
cs.CL · 2026-01-07 · unverdicted · none · ref 3
Reasoning models judge better than non-reasoning LLMs yet retain biases; generating an evaluation plan first mitigates bias without losing accuracy.

Hallucinations Undermine Trust; Metacognition is a Way Forward
cs.CL · 2026-05-02 · unverdicted · none · ref 48
LLMs need metacognition to align expressed uncertainty with their actual knowledge boundaries, moving beyond knowledge expansion to reduce confident errors.

NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons
cs.CL · 2026-04-03 · unverdicted · none · ref 8
NeuReasoner detects neuron fluctuation patterns linked to reasoning failures and inserts special tokens to enable controllable self-correction, delivering up to 27% performance gains and 19-63% lower token use across multiple benchmarks and model sizes.

Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping
cs.LG · 2026-01-24 · unverdicted · none · ref 40
ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
cs.LG · 2026-04-15 · unverdicted · none · ref 138
The paper introduces the Proxy Compression Hypothesis as a unifying framework explaining reward hacking in RLHF as an emergent result of compressing high-dimensional human objectives into proxy reward signals under optimization pressure.

FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness
cs.CL · 2026-04-11 · unverdicted · none · ref 6
FAITH improves LLM factual accuracy by mapping confidence and semantic entropy into natural-language knowledge-state quadrants for trustworthiness and honestness, then applying PPO with a combined reward and retrieval augmentation.

Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search
cs.AI · 2026-04-09 · unverdicted · none · ref 3
HiExp extracts hierarchical experience knowledge from reasoning trajectories via contrastive analysis and clustering to regularize RL training, turning stochastic exploration into strategic search with reported gains in performance and generalization.

Self-Rewarding Vision-Language Model via Reasoning Decomposition
cs.CV · 2025-08-27 · unverdicted · none · ref 24
Vision SR1 decomposes VLM reasoning into visual and language components and uses internal self-rewards to improve visual reasoning and reduce hallucinations more efficiently than external-supervision methods.

KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
cs.AI · 2025-06-24 · unverdicted · none · ref 7
KnowRL integrates a knowledge-verification factuality reward into RL training to enforce fact-based reasoning steps and lower hallucination rates in LLMs.

Evaluating Reasoning Models for Queries with Presuppositions
cs.CL · 2026-05-04 · unverdicted · none · ref 7
Reasoning models achieve only 2-11% higher accuracy than non-reasoning models when handling queries with false presuppositions, failing to challenge 26-42% of them and remaining sensitive to presupposition strength.

Rethinking Agentic Reinforcement Learning In Large Language Models
cs.AI · 2026-04-30 · unverdicted · none · ref 112 · 3 links
The paper reviews conceptual foundations, methodological innovations, effective designs, critical challenges, and future directions for LLM-based Agentic Reinforcement Learning.

Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
cs.LG · 2025-09-26 · unreviewed · ref 37
