ReAct: Synergizing Reasoning and Acting in Language Models
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Citing papers explorer
- Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback
  A multi-dimensional behavioral scoring system using LLM judges evaluates agentic stock predictors and feeds scores into closed-loop RL to improve one-day MAPE by 11.5% on held-out data.
- Measuring the Unmeasurable: Markov Chain Reliability for LLM Agents
  TraceToChain models LLM agent traces as absorbing DTMCs using automatic clustering and smoothed MLE, with KS and AIC validation, to reconcile pass@k, pass^k, and RDC as projections of a single first-passage success-time distribution.
- PFAgent: A Tractable and Self-Evolving Power-Flow Agent for Interactive Grid Analysis
  PFAgent automates interactive power-flow analysis by combining intent parsing, tool execution, verification-driven self-evolution, and an evaluation framework, with demonstrations on IEEE benchmark systems.
- Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models
  Nemobot is an LLM-powered platform for creating and refining strategic game agents across dictionary, solvable, heuristic, and learning-based games, moving toward self-programming AI.