{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:PPVKZDEHDDS4UEBGUQD3IP2TPF","short_pith_number":"pith:PPVKZDEH","schema_version":"1.0","canonical_sha256":"7beaac8c8718e5ca1026a407b43f53795a72a8ee1a859b555d1376ab95508428","source":{"kind":"arxiv","id":"2605.14212","version":1},"attestation_state":"computed","paper":{"title":"MetaAgent-X : Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"MetaAgent-X jointly trains the designer and executors of automatic multi-agent systems using end-to-end reinforcement learning.","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Huazheng Wang, Jiayu Chang, Jishen Zhao, Nan Wang, Qingyun Wu, Yaolun Zhang, Yiran Wu, Yizhao Chen, Yujie Zhao","submitted_at":"2026-05-14T00:11:27Z","abstract_excerpt":"Automatic multi-agent systems aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen, which creating a frozen-executor ceiling and leaving the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automati"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2605.14212","kind":"arxiv","version":1},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.AI","submitted_at":"2026-05-14T00:11:27Z","cross_cats_sorted":[],"title_canon_sha256":"2a3ceed12129c862013a513d84303942bd8ca228ef545f36479cf01b494d748f","abstract_canon_sha256":"90b5dd0ee5f8c0671ad18c3c6869262239d1d5d063f971c2ea82535446863831"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:39:10.919510Z","signature_b64":"HK+mz49jBjAmPmiE4/+2lxXeLalHw7bfsBjsUS7dflYZxX+mJRWix2uZ0HcIqpjZrgD+3FsMARL6fX6QOAFPAw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"7beaac8c8718e5ca1026a407b43f53795a72a8ee1a859b555d1376ab95508428","last_reissued_at":"2026-05-17T23:39:10.919082Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:39:10.919082Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"MetaAgent-X : Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"MetaAgent-X jointly trains the designer and executors of automatic multi-agent systems using end-to-end reinforcement learning.","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Huazheng Wang, Jiayu Chang, Jishen Zhao, Nan Wang, Qingyun Wu, Yaolun Zhang, Yiran Wu, Yizhao Chen, Yujie Zhao","submitted_at":"2026-05-14T00:11:27Z","abstract_excerpt":"Automatic multi-agent systems aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen, which creating a frozen-executor ceiling and leaving the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automati"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"MetaAgent-X consistently outperforms existing automatic MAS baselines, achieving up to 21.7% gains. ... These results establish end-to-end trainable automatic MAS as a practical paradigm for building self-designing and self-executing agentic models.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That Executor Designer Hierarchical Rollout and Stagewise Co-evolution provide stable joint optimization and accurate credit assignment across designer and executor trajectories without introducing new instabilities or biases that would prevent both components from improving.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"MetaAgent-X uses end-to-end RL to jointly optimize automatic multi-agent system design and execution, outperforming baselines by up to 21.7% through hierarchical rollouts and stagewise co-evolution.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"MetaAgent-X jointly trains the designer and executors of automatic multi-agent systems using end-to-end reinforcement learning.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"76b7e384b93f58cdc485ece5cef168a0be915c86185b3c43061d562b113e25f2"},"source":{"id":"2605.14212","kind":"arxiv","version":1},"verdict":{"id":"3bde2c6f-5fdf-4a2f-b951-47ddd30f87c8","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T02:45:37.586097Z","strongest_claim":"MetaAgent-X consistently outperforms existing automatic MAS baselines, achieving up to 21.7% gains. ... These results establish end-to-end trainable automatic MAS as a practical paradigm for building self-designing and self-executing agentic models.","one_line_summary":"MetaAgent-X uses end-to-end RL to jointly optimize automatic multi-agent system design and execution, outperforming baselines by up to 21.7% through hierarchical rollouts and stagewise co-evolution.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That Executor Designer Hierarchical Rollout and Stagewise Co-evolution provide stable joint optimization and accurate credit assignment across designer and executor trajectories without introducing new instabilities or biases that would prevent both components from improving.","pith_extraction_headline":"MetaAgent-X jointly trains the designer and executors of automatic multi-agent systems using end-to-end reinforcement learning."},"references":{"count":52,"sample":[{"doi":"","year":2024,"title":"Yujie Zhao, Hejia Zhang, Hanxian Huang, Zhongming Yu, and Jishen Zhao","work_id":"e6d87867-de15-4827-b740-f4aa603f708a","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"14 Figure 4: Sensitivity analysis on the stage length for designer–executor alternation","work_id":"3590c28c-4ac0-41c9-b839-a869f25bb253","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Delivery formatting.Inter-agent messages must be strictly enclosed within <delivery>...</delivery> tags. This constraint serves a dual purpose: it establishes a structured, easily parsable communicati","work_id":"cadd2d18-128e-43b1-918c-7001c4cce933","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Compute total number of ways to choose 4 numbers from 10:C(10,4)","work_id":"2aa90731-24bf-47c6-9ffd-29d736a4d47c","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"ComputeP(grand prize): number of ways to match all 4 numbers","work_id":"df92995a-93fa-4a13-8312-47293fb49251","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":52,"snapshot_sha256":"d34860785c889846c0b0a127339043d0ffd501cd70baa0e7b2bc9fd0e614884a","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"8db17f618145b6f3b5bbc4d3213a0fe744253cf312ed4f399402b985c038bd42"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2605.14212","created_at":"2026-05-17T23:39:10.919145+00:00"},{"alias_kind":"arxiv_version","alias_value":"2605.14212v1","created_at":"2026-05-17T23:39:10.919145+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.14212","created_at":"2026-05-17T23:39:10.919145+00:00"},{"alias_kind":"pith_short_12","alias_value":"PPVKZDEHDDS4","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"PPVKZDEHDDS4UEBG","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"PPVKZDEH","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF","json":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF.json","graph_json":"https://pith.science/api/pith-number/PPVKZDEHDDS4UEBGUQD3IP2TPF/graph.json","events_json":"https://pith.science/api/pith-number/PPVKZDEHDDS4UEBGUQD3IP2TPF/events.json","paper":"https://pith.science/paper/PPVKZDEH"},"agent_actions":{"view_html":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF","download_json":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF.json","view_paper":"https://pith.science/paper/PPVKZDEH","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2605.14212&json=true","fetch_graph":"https://pith.science/api/pith-number/PPVKZDEHDDS4UEBGUQD3IP2TPF/graph.json","fetch_events":"https://pith.science/api/pith-number/PPVKZDEHDDS4UEBGUQD3IP2TPF/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF/action/timestamp_anchor","attest_storage":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF/action/storage_attestation","attest_author":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF/action/author_attestation","sign_citation":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF/action/citation_signature","submit_replication":"https://pith.science/pith/PPVKZDEHDDS4UEBGUQD3IP2TPF/action/replication_record"}},"created_at":"2026-05-17T23:39:10.919145+00:00","updated_at":"2026-05-17T23:39:10.919145+00:00"}