Pith Number

pith:YF6Q6WFE

pith:2026:YF6Q6WFESKNLMY7FPAWQJZGPWV
Status: not attested · not anchored · not stored · refs pending

Recursive Agent Optimization

Apurva Gandhi, Aviral Kumar, Graham Neubig, Satyaki Chakraborty, Xiangjun Wang

Reinforcement learning trains agents to recursively delegate sub-tasks to copies of themselves for divide-and-conquer scaling.

arxiv:2605.06639 v1 · 2026-05-07 · cs.LG · cs.AI · cs.CL · cs.MA

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{YF6Q6WFESKNLMY7FPAWQJZGPWV}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1: strongest claim

recursive agents trained in this way enjoy better training efficiency, can scale to tasks that go beyond the model's context window, generalize to tasks much harder than the ones the agent was trained on, and can enjoy reduced wall-clock time compared to single-agent systems.

C2: weakest assumption

That reinforcement learning can reliably teach agents effective delegation and communication rules without introducing prohibitive overhead, infinite recursion risks, or communication failures that undermine the divide-and-conquer benefit.

C3: one-line summary

RAO uses RL to train recursive agents that delegate sub-tasks to self-copies, yielding better training efficiency, generalization to harder tasks, scaling beyond context windows, and lower wall-clock time.
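
The delegation pattern named in the claims above can be illustrated with a toy sketch. It shows only the divide-and-conquer control flow (a call that hands halves of an oversized task to copies of itself, with a depth cap against runaway recursion); it is not the paper's training procedure, and `agent`, `CONTEXT_BUDGET`, and `MAX_DEPTH` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

CONTEXT_BUDGET = 4  # hypothetical: max items one call may handle directly
MAX_DEPTH = 8       # hypothetical guard against unbounded recursion


@dataclass
class Result:
    value: int
    calls: int  # agent invocations used, including this one


def agent(items: list[int], depth: int = 0) -> Result:
    """Sum `items`, delegating halves to copies of itself when over budget."""
    if len(items) <= CONTEXT_BUDGET:
        # Small enough to solve directly, with no delegation.
        return Result(sum(items), 1)
    if depth >= MAX_DEPTH:
        raise RecursionError("delegation depth limit reached")
    mid = len(items) // 2
    left = agent(items[:mid], depth + 1)   # delegate first half
    right = agent(items[mid:], depth + 1)  # delegate second half
    return Result(left.value + right.value, left.calls + right.calls + 1)


print(agent(list(range(16))))  # → Result(value=120, calls=7)
```

Each call sees at most `CONTEXT_BUDGET` items or two sub-results, so the total task can exceed what any single call handles. The two delegated calls here run sequentially; running them in parallel is where a wall-clock benefit could come from.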

Receipt and verification
First computed: 2026-05-18T15:04:06.583108Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

c17d0f58a4929ab663e5782d04e4cfb57d59e3b59f6cb043e21ff81a7d72f48c

Aliases

arxiv: 2605.06639 · arxiv_version: 2605.06639v1 · doi: 10.48550/arxiv.2605.06639 · pith_short_12: YF6Q6WFESKNL · pith_short_16: YF6Q6WFESKNLMY7F · pith_short_8: YF6Q6WFE
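
The identifiers on this page are related in a way that can be checked locally. In this record, each `pith_short_N` alias is the first N characters of the full Pith Number, and the full number coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The Python sketch below checks both observations. Treating them as the general derivation rule is an assumption drawn from this one record, not a documented specification, and the constant and function names are illustrative.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "c17d0f58a4929ab663e5782d04e4cfb57d59e3b59f6cb043e21ff81a7d72f48c"
PITH_NUMBER = "YF6Q6WFESKNLMY7FPAWQJZGPWV"
SHORT_ALIASES = {8: "YF6Q6WFE", 12: "YF6Q6WFESKNL", 16: "YF6Q6WFESKNLMY7F"}


def base32_prefix(hex_digest: str, length: int) -> str:
    """Return the first `length` Base32 characters of a hex-encoded digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Assumption: the identifier is a 26-character Base32 prefix of the hash.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print(derived == PITH_NUMBER)

# Each short alias should be a plain prefix of the full identifier.
print(all(PITH_NUMBER[:n] == alias for n, alias in SHORT_ALIASES.items()))
```

Both checks print `True` for the values shown here.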
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/YF6Q6WFESKNLMY7FPAWQJZGPWV \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c17d0f58a4929ab663e5782d04e4cfb57d59e3b59f6cb043e21ff81a7d72f48c
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "b8fd2ec1394c6f4d7e5f57b9f7439374eeef98be31d1a639ed097d815751a76f",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.MA"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-07T17:49:09Z",
    "title_canon_sha256": "bdb824664eb729b97f9636399a078ea2d309fcee715c7fadf02788e627016269"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.06639",
    "kind": "arxiv",
    "version": 1
  }
}
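
The canonicalization step in the verification command can also be run offline against the record printed above. The sketch below uses the same `json.dumps` settings as the one-liner (sorted keys, compact separators, no ASCII escaping) and hashes the UTF-8 bytes; `canonical_sha256` is a name chosen for this example. Whether the output reproduces the published hash depends on the JSON shown on this page matching the served `canonical_record` exactly, which is assumed rather than guaranteed here.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record with the same settings as the shell one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,         # key order must not affect the hash
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,     # keep non-ASCII characters as UTF-8
    ).encode()
    return hashlib.sha256(canonical).hexdigest()


# The canonical record as printed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "b8fd2ec1394c6f4d7e5f57b9f7439374eeef98be31d1a639ed097d815751a76f",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.MA"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-07T17:49:09Z",
        "title_canon_sha256": "bdb824664eb729b97f9636399a078ea2d309fcee715c7fadf02788e627016269",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.06639", "kind": "arxiv", "version": 1},
}

print(canonical_sha256(record))
```

Compare the printed digest with the canonical hash listed under "Canonical hash". Because keys are sorted before hashing, reordering the fields of `record` does not change the result.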