PERMA is a new benchmark using temporally ordered events, text variability, and linguistic alignment to evaluate LLM memory agents on persona consistency beyond simple retrieval.
https://arxiv.org/abs/2509.14404
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
Introduces a taxonomy of nine LLM code smells, a static detection tool, and reports 73.5% prevalence with 91.3% precision and 71.8% recall across 692 projects.
Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.
citing papers explorer
-
PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments
PERMA is a new benchmark using temporally ordered events, text variability, and linguistic alignment to evaluate LLM memory agents on persona consistency beyond simple retrieval.
-
LLM Code Smells: A Taxonomy and Detection Approach
Introduces a taxonomy of nine LLM code smells, a static detection tool, and reports 73.5% prevalence with 91.3% precision and 71.8% recall across 692 projects.
-
Agent Mentor: Framing Agent Knowledge through Semantic Trajectory Analysis
Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.