Agentic CLEAR automates multi-level evaluation of LLM agents, generating textual insights at system, trace, and node granularity that align with human annotations and predict task success.
3 Pith papers cite this work.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
  Agentic CLEAR automates multi-level evaluation of LLM agents, generating textual insights at system, trace, and node granularity that align with human annotations and predict task success.
- AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
  AMARIS augments rubric-based RL with long-term evaluation memory and dual retrieval to update rubrics, outperforming baselines across domains with ~5% overhead.
- ODRPO: Ordinal Decompositions of Discrete Rewards for Robust Policy Optimization
  ODRPO decomposes discrete rewards into ordinal binary indicators to create robust, variance-aware advantage estimators for noisy RLAIF in LLM alignment.