Pith Number

pith:3AKINPDP

pith:2025:3AKINPDPEY5E4DLPOLPFFZDOXI

not attested · not anchored · not stored · refs resolved

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

Merouane Debbah, Mohamed Amine Ferrag, Norbert Tihanyi

A review organizes roughly 60 benchmarks for large language models and autonomous agents into one taxonomy covering reasoning, code, and real-world tasks.

arxiv:2504.19678 v2 · 2025-04-28 · cs.AI · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{3AKINPDPEY5E4DLPOLPFFZDOXI}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

we present a side-by-side comparison of benchmarks developed between 2019 and 2025 that evaluate these models and agents across multiple domains... we propose a taxonomy of approximately 60 benchmarks that cover general and academic knowledge reasoning, mathematical problem-solving, code generation and software engineering, factual grounding and retrieval, domain-specific evaluations, multimodal and embodied tasks, task orchestration, and interactive assessments.

C2 · weakest assumption

The paper asserts that the landscape remains fragmented and lacks a unified taxonomy or comprehensive survey, and it assumes that its proposed taxonomy of approximately 60 benchmarks resolves this without major omissions or selection bias in the covered works.

C3 · one-line summary

A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.

References

236 extracted · 236 resolved · 43 Pith anchors

[1] OpenAI o1 System Card 2024 · arXiv:2412.16720

[2] Qwen2.5-Omni Technical Report 2025 · arXiv:2503.20215

[3] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 2025 · arXiv:2501.12948

[4] The Llama 3 Herd of Models 2024 · arXiv:2407.21783

[5] Understanding the planning of LLM agents: A survey 2024 · arXiv:2402.02716

Formal links

1 machine-checked theorem link

Cited by

33 papers in Pith

Large Language Model Agent for User-friendly Chemical Process Simulations

Context-Mediated Domain Adaptation in Multi-Agent Sensemaking Systems

Reinforced Preference Optimization for Reasoning-Augmented Recommendations

Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling

STAR: Failure-Aware Markovian Routing for Multi-Agent Spatiotemporal Reasoning

Receipt and verification

First computed	2026-05-17T23:38:53.741627Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

d81486bc6f263a4e0d6f72de52e46eba1eacab61c38fac9bd7279d6350bc4f6b

Aliases

arxiv: 2504.19678 · arxiv_version: 2504.19678v2 · doi: 10.48550/arxiv.2504.19678 · pith_short_12: 3AKINPDPEY5E · pith_short_16: 3AKINPDPEY5E4DLP · pith_short_8: 3AKINPDP
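The aliases above appear to be related to the canonical hash in a simple way. Judging only from the values shown on this page (this is an inference, not a documented rule of the Pith schema), the 26-character Pith Number matches the unpadded RFC 4648 Base32 encoding of the first 16 bytes of the canonical SHA-256 hash, and each `pith_short_N` alias is its first N characters. A minimal Python sketch of that relationship, with the hypothetical helper names `pith_number_from_hash` and `short_alias`:

```python
import base64

def pith_number_from_hash(canonical_hash_hex: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest and drop '=' padding.

    Assumption: this mirrors how the identifier on this page relates to its
    canonical hash; it is inferred from the published values, not from a spec.
    """
    prefix = bytes.fromhex(canonical_hash_hex)[:n_bytes]
    return base64.b32encode(prefix).decode("ascii").rstrip("=")

def short_alias(pith_number: str, length: int) -> str:
    """Return the first `length` characters, as the pith_short_N aliases do."""
    return pith_number[:length]

# Canonical hash as published in this record.
canonical_hash = "d81486bc6f263a4e0d6f72de52e46eba1eacab61c38fac9bd7279d6350bc4f6b"

pith = pith_number_from_hash(canonical_hash)
print(pith)  # 3AKINPDPEY5E4DLPOLPFFZDOXI

for n in (8, 12, 16):
    print(f"pith_short_{n}:", short_alias(pith, n))
```

If the relationship holds in general, a mirror could cross-check an identifier against its canonical hash without contacting the resolver; other records would need to be tested before relying on it.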

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/3AKINPDPEY5E4DLPOLPFFZDOXI \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d81486bc6f263a4e0d6f72de52e46eba1eacab61c38fac9bd7279d6350bc4f6b

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "fdacaba0a601f052017bef4adecc766750331cb6d24f763c7c1e3ddc86303bda",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-04-28T11:08:22Z",
    "title_canon_sha256": "4dfa9e24fcc765bf15db0d4e228dd847b5d56f8dd817458a312a013cd5948841"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2504.19678",
    "kind": "arxiv",
    "version": 2
  }
}
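
The check under "Verify this Pith Number yourself" can also be run offline against the record printed above. The sketch below applies the same canonicalization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and compares the digest with the published canonical hash. The function name `canonical_sha256` is hypothetical, and the record is copied from this page by hand, so a mismatch could indicate a transcription difference rather than a bad record.

```python
import hashlib
import json

# Published canonical hash for this record.
EXPECTED = "d81486bc6f263a4e0d6f72de52e46eba1eacab61c38fac9bd7279d6350bc4f6b"

# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "fdacaba0a601f052017bef4adecc766750331cb6d24f763c7c1e3ddc86303bda",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2025-04-28T11:08:22Z",
        "title_canon_sha256": "4dfa9e24fcc765bf15db0d4e228dd847b5d56f8dd817458a312a013cd5948841",
    },
    "schema_version": "1.0",
    "source": {"id": "2504.19678", "kind": "arxiv", "version": 2},
}

def canonical_sha256(obj) -> str:
    """Hash an object's compact, key-sorted JSON form, as the one-liner does."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.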