G-Eval: NLG evaluation using GPT-4 with better human alignment

· 2023

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

A multi-dimensional behavioral scoring system using LLM judges evaluates agentic stock predictors and feeds scores into closed-loop RL to improve one-day MAPE by 11.5% on held-out data.

citing papers explorer

Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback

cs.LG · 2026-05-07 · unverdicted · none · ref 4 · 2 links
