{"state_type":"pith_open_graph_state","state_version":"1.0","pith_number":"pith:2026:WFKUFZ5SZI3EHXDFXVT3S6V45L","merge_version":"pith-open-graph-merge-v1","event_count":2,"valid_event_count":2,"invalid_event_count":0,"equivocation_count":0,"current":{"canonical_record":{"metadata":{"abstract_canon_sha256":"0692e3c7bb8b97307acea2c1132cff40a11070e0fe5fcfa8c4e9a46cdc7ac99b","cross_cats_sorted":["cs.CL"],"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2026-05-11T17:55:13Z","title_canon_sha256":"763b92c85adc2aae4ad8ace05bd5d0aa631e4527c6023334dec339fd4c2c40b2"},"schema_version":"1.0","source":{"id":"2605.10923","kind":"arxiv","version":2}},"source_aliases":[{"alias_kind":"arxiv","alias_value":"2605.10923","created_at":"2026-05-20T00:03:17Z"},{"alias_kind":"arxiv_version","alias_value":"2605.10923v2","created_at":"2026-05-20T00:03:17Z"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.10923","created_at":"2026-05-20T00:03:17Z"},{"alias_kind":"pith_short_12","alias_value":"WFKUFZ5SZI3E","created_at":"2026-05-20T00:03:17Z"},{"alias_kind":"pith_short_16","alias_value":"WFKUFZ5SZI3EHXDF","created_at":"2026-05-20T00:03:17Z"},{"alias_kind":"pith_short_8","alias_value":"WFKUFZ5S","created_at":"2026-05-20T00:03:17Z"}],"graph_snapshots":[{"event_id":"sha256:a16888ef7e4f992d6600cd11de0c8ea4cae40a6702fa73f50a6dfa28557f90d2","target":"graph","created_at":"2026-05-20T00:03:17Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"graph_snapshot":{"author_claims":{"count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","strong_count":0},"builder_version":"pith-number-builder-2026-05-17-v1","claims":{"count":4,"items":[{"attestation":"unclaimed","claim_id":"C1","kind":"strongest_claim","source":"verdict.strongest_claim","status":"machine_extracted","text":"SLIM treats the active external skill set as a dynamic optimization variable jointly updated with policy learning. Experiments show that SLIM outperforms the best baselines by an average of 7.1% points across ALFWorld and SearchQA. Results further indicate that policy learning and external skill retention are not mutually exclusive."},{"attestation":"unclaimed","claim_id":"C2","kind":"weakest_assumption","source":"verdict.weakest_assumption","status":"machine_extracted","text":"The assumption that the optimal active skill set is non-monotonic, task- and stage-dependent, and that leave-one-skill-out validation can reliably estimate each skill's marginal external contribution without introducing bias or prohibitive cost."},{"attestation":"unclaimed","claim_id":"C3","kind":"one_line_summary","source":"verdict.one_line_summary","status":"machine_extracted","text":"SLIM dynamically optimizes active external skills in agentic RL via leave-one-skill-out marginal contribution estimates and three lifecycle operations, outperforming baselines by 7.1% on ALFWorld and SearchQA while showing some skills are internalized and others remain external."},{"attestation":"unclaimed","claim_id":"C4","kind":"headline","source":"verdict.pith_extraction.headline","status":"machine_extracted","text":"Agentic RL agents improve when the active external skill set is treated as a dynamic optimization variable updated jointly with policy learning."}],"snapshot_sha256":"b379d8823da645bda6714f3a67e3aae02c9add1e48e1ed27cc47a6045b91716d"},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"integrity":{"available":true,"clean":true,"detectors_run":[{"findings_count":0,"name":"ai_meta_artifact","ran_at":"2026-05-19T13:37:04.952737Z","status":"completed","version":"1.0.0"},{"findings_count":0,"name":"doi_title_agreement","ran_at":"2026-05-19T10:31:17.195211Z","status":"completed","version":"1.0.0"},{"findings_count":0,"name":"doi_compliance","ran_at":"2026-05-19T08:53:53.098043Z","status":"completed","version":"1.0.0"}],"endpoint":"/pith/2605.10923/integrity.json","findings":[],"snapshot_sha256":"0f5d711381f42d147aa4514be036fd7ba2ad0b324300157eb19cb1fe214095ab","summary":{"advisory":0,"by_detector":{},"critical":0,"informational":0}},"paper":{"abstract_excerpt":"Large language model agents increasingly rely on external skills to solve complex tasks, where skills act as modular units that extend their capabilities beyond what parametric memory alone supports. Existing methods assume external skills either accumulate as persistent guidance or internalized into the policy, eventually leading to zero-skill inference. We argue this assumption is overly restrictive, since with limited parametric capacity and uneven marginal contribution across skills, the optimal active skill set is non-monotonic, task- and stage-dependent. In this work, we propose SLIM, a ","authors_text":"Hong Cheng, Junhao Shen, Teng Zhang, Xiaoyan Zhao","cross_cats":["cs.CL"],"headline":"Agentic RL agents improve when the active external skill set is treated as a dynamic optimization variable updated jointly with policy learning.","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2026-05-11T17:55:13Z","title":"Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning"},"references":{"count":0,"internal_anchors":0,"resolved_work":0,"sample":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"2605.10923","kind":"arxiv","version":2},"verdict":{"created_at":"2026-05-12T03:41:04.106340Z","id":"26f9c719-bc8f-4aa1-a8b7-106e86970edf","model_set":{"reader":"grok-4.3"},"one_line_summary":"SLIM dynamically optimizes active external skills in agentic RL via leave-one-skill-out marginal contribution estimates and three lifecycle operations, outperforming baselines by 7.1% on ALFWorld and SearchQA while showing some skills are internalized and others remain external.","pipeline_version":"pith-pipeline@v0.9.0","pith_extraction_headline":"Agentic RL agents improve when the active external skill set is treated as a dynamic optimization variable updated jointly with policy learning.","strongest_claim":"SLIM treats the active external skill set as a dynamic optimization variable jointly updated with policy learning. Experiments show that SLIM outperforms the best baselines by an average of 7.1% points across ALFWorld and SearchQA. Results further indicate that policy learning and external skill retention are not mutually exclusive.","weakest_assumption":"The assumption that the optimal active skill set is non-monotonic, task- and stage-dependent, and that leave-one-skill-out validation can reliably estimate each skill's marginal external contribution without introducing bias or prohibitive cost."}},"verdict_id":"26f9c719-bc8f-4aa1-a8b7-106e86970edf"}}],"author_attestations":[],"timestamp_anchors":[],"storage_attestations":[],"citation_signatures":[],"replication_records":[],"corrections":[],"mirror_hints":[],"record_created":{"event_id":"sha256:6425d99cc496046853577ba24e223e64a1a6ef084f5910dc0d145f11f606ab7e","target":"record","created_at":"2026-05-20T00:03:17Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"attestation_state":"computed","canonical_record":{"metadata":{"abstract_canon_sha256":"0692e3c7bb8b97307acea2c1132cff40a11070e0fe5fcfa8c4e9a46cdc7ac99b","cross_cats_sorted":["cs.CL"],"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2026-05-11T17:55:13Z","title_canon_sha256":"763b92c85adc2aae4ad8ace05bd5d0aa631e4527c6023334dec339fd4c2c40b2"},"schema_version":"1.0","source":{"id":"2605.10923","kind":"arxiv","version":2}},"canonical_sha256":"b15542e7b2ca3643dc65bd67b97abceadc7fb1324703cc5520728d550027df9c","receipt":{"algorithm":"ed25519","builder_version":"pith-number-builder-2026-05-17-v1","canonical_sha256":"b15542e7b2ca3643dc65bd67b97abceadc7fb1324703cc5520728d550027df9c","first_computed_at":"2026-05-20T00:03:17.205903Z","key_id":"pith-v1-2026-05","kind":"pith_receipt","last_reissued_at":"2026-05-20T00:03:17.205903Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","receipt_version":"0.3","signature_b64":"iZC8wrUvWiVRY1jF/zNKjMIbopmi2EoOdPVQ8ziUlXVB7KJOOfJirjd+OnIRWk+wyN2LhcXhpci8+O/OGTS5BQ==","signature_status":"signed_v1","signed_at":"2026-05-20T00:03:17.206769Z","signed_message":"canonical_sha256_bytes"},"source_id":"2605.10923","source_kind":"arxiv","source_version":2}}},"equivocations":[],"invalid_events":[],"applied_event_ids":["sha256:6425d99cc496046853577ba24e223e64a1a6ef084f5910dc0d145f11f606ab7e","sha256:a16888ef7e4f992d6600cd11de0c8ea4cae40a6702fa73f50a6dfa28557f90d2"],"state_sha256":"dfc379d724d40c7ff07715274305e7ba861aff12319ac37cf0f0fef40cff700e"}