ExCyTIn-Bench is the first benchmark of 7542 questions from Microsoft Sentinel threat investigation graphs, where the best LLM agent achieves a reward of 0.606.
Expel: Llm agents are experiential learners
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4representative citing papers
SkillsVote is a governance system for agent skills that profiles corpora, recommends via search, and gates updates on successful reusable outcomes, yielding benchmark gains without model changes.
The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
citing papers explorer
-
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.