Terry , title =

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 3

citation-polarity summary

polarities

use method 3

representative citing papers

Soft Tournament Equilibrium

cs.AI · 2026-04-06 · unverdicted · novelty 7.0

STE is a differentiable method to compute continuous analogues of the Top Cycle and Uncovered Set from pairwise comparison data for stable set-valued evaluation of cyclic agent interactions.

Understanding Goal Generalisation in Sequential Reinforcement Learning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

Empirical analysis of over 100 sequential RL training pipelines across 250+ OOD environments finds salient features drive generalization and early goals persist, with latent policy gradients simulating latent variable evolution to predict OOD behavior from training history.

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

cs.AI · 2024-08-13 · unverdicted · novelty 6.0

Agent Q integrates MCTS-guided search, self-critique, and off-policy DPO to train LLM agents that outperform behavior cloning and reinforced fine-tuning baselines in WebShop and achieve up to 95.4% success in real-world booking scenarios.

Can LLMs Rank? A Tale of Triads and Triage

cs.CY · 2026-06-29 · unverdicted · novelty 5.0

LLM ranking reliability for prioritization tasks can be assessed via coefficient of consistency ζ (intra-run circular triads) and Kendall's τ (inter-run distance), with three leading models showing distinct consistency profiles on homelessness allocation and ED triage.
