Agentick is a new benchmark for sequential decision-making agents that evaluates RL, LLM, VLM, hybrid, and human approaches across 37 tasks and finds no single method dominates.
AAAI Conference on Artificial Intelligence
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Forager is a lightweight partially-observable continual RL environment that exposes loss of plasticity in current agents and highlights the value of state construction for ongoing learning.
citing papers explorer
- Agentick: A Unified Benchmark for General Sequential Decision-Making Agents
  Agentick is a new benchmark for sequential decision-making agents that evaluates RL, LLM, VLM, hybrid, and human approaches across 37 tasks and finds no single method dominates.
- Forager: a lightweight testbed for continual learning with partial observability in RL
  Forager is a lightweight partially-observable continual RL environment that exposes loss of plasticity in current agents and highlights the value of state construction for ongoing learning.