ARC-AGI-2 adds a larger, more complex set of tasks to the original ARC-AGI benchmark to give finer-grained measurement of fluid intelligence in AI.
Lake, and Todd M
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Hand-crafted grid descriptors at 50% trajectory completion predict within-task ARC-AGI solver success (AUC 0.885) and transfer across solvers (AUC 0.75).
Frontier AI models' no-CoT 50% task-completion time horizons have doubled yearly over six years, reaching over 3 minutes for GPT-5.5 with projections to 25 minutes by 2030.
citing papers explorer
- Structural Grid Descriptors Predict Within-Task Solver Success on ARC-AGI