pith. sign in

Title resolution pending

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

verdicts

UNVERDICTED 9

roles

background 1

polarities

background 1

representative citing papers

Verifier-Free RL for LLMs via Intrinsic Gradient-Norm Reward

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

VIGOR assigns higher rewards to LLM completions that produce smaller l2 norms of teacher-forced negative log-likelihood gradients, with sqrt(T) length correction and group ranking, yielding +3.31% math and +1.91% code gains over RLIF on Qwen2.5-7B.

Geometry Guided Self-Consistency for Physical AI

cs.RO · 2026-05-09 · unverdicted · novelty 6.0

KeyStone improves task success rates in diffusion-based physical AI models by up to 13.3% by sampling K trajectories in parallel, clustering them in action space, and returning the medoid of the largest cluster.

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

cs.AI · 2024-08-13 · unverdicted · novelty 6.0

Agent Q integrates MCTS-guided search, self-critique, and off-policy DPO to train LLM agents that outperform behavior cloning and reinforced fine-tuning baselines in WebShop and achieve up to 95.4% success in real-world booking scenarios.

citing papers explorer

Showing 9 of 9 citing papers.