Terry , title =

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

roles: method 3

citation-polarity summary

polarities: use method 3

representative citing papers

Soft Tournament Equilibrium

cs.AI · 2026-04-06 · unverdicted · novelty 7.0

STE is a differentiable method to compute continuous analogues of the Top Cycle and Uncovered Set from pairwise comparison data for stable set-valued evaluation of cyclic agent interactions.

Understanding Goal Generalisation in Sequential Reinforcement Learning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

An empirical analysis of over 100 sequential RL training pipelines across more than 250 out-of-distribution (OOD) environments finds that salient features drive generalization and that early goals persist. Latent policy gradients simulate latent variable evolution to predict OOD behavior from training history.

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

cs.AI · 2024-08-13 · unverdicted · novelty 6.0

Agent Q integrates MCTS-guided search, self-critique, and off-policy DPO to train LLM agents that outperform behavior cloning and reinforced fine-tuning baselines in WebShop and achieve up to 95.4% success in real-world booking scenarios.
