{"paper":{"title":"Skill Neologisms: Towards Skill-based Continual Learning","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Skill neologisms let LLMs gain new abilities by adding optimized soft tokens to the vocabulary without any weight updates.","cross_cats":["cs.AI"],"primary_cat":"cs.LG","authors_text":"Antonin Berthon, Mihaela van der Schaar, Nicolas Astorga","submitted_at":"2026-05-06T14:27:12Z","abstract_excerpt":"Modern LLMs show mastery over an ever-growing range of skills, as well as the ability to compose them flexibly. However, extending model capabilities to new skills in a scalable manner is an open problem: fine-tuning and parameter-efficient variants risk catastrophic forgetting, while context-based approaches have limited expressiveness and are constrained by the model's effective context. We explore skill neologisms--soft tokens integrated in the model's vocabulary and optimized to improve capabilities over a specific skill--as a way to selectively acquire new skills without weight updates. W"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We then show that skill neologisms can be learned to improve model capabilities on specific skills while being composable with out-of-distribution skills, and that independently trained skill neologisms can be composed zero-shot. These results suggest that skill neologisms may provide a scalable path towards skill-based continual learning.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That independently optimized skill neologisms can be added and composed without degrading the base model's performance on unrelated tasks or causing interference between skills.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Skill neologisms are optimized soft tokens that improve LLM performance on targeted skills without weight updates and allow zero-shot composition for continual learning.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Skill neologisms let LLMs gain new abilities by adding optimized soft tokens to the vocabulary without any weight updates.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"565c99aece2e655cf262b764f6dced6bf11e3cac5d72d32e4d6f42152ea236e9"},"source":{"id":"2605.04970","kind":"arxiv","version":2},"verdict":{"id":"9ce6e16c-e00a-4765-b368-62598d83ebb0","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-08T17:22:13.566506Z","strongest_claim":"We then show that skill neologisms can be learned to improve model capabilities on specific skills while being composable with out-of-distribution skills, and that independently trained skill neologisms can be composed zero-shot. These results suggest that skill neologisms may provide a scalable path towards skill-based continual learning.","one_line_summary":"Skill neologisms are optimized soft tokens that improve LLM performance on targeted skills without weight updates and allow zero-shot composition for continual learning.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That independently optimized skill neologisms can be added and composed without degrading the base model's performance on unrelated tasks or causing interference between skills.","pith_extraction_headline":"Skill neologisms let LLMs gain new abilities by adding optimized soft tokens to the vocabulary without any weight updates."},"integrity":{"clean":false,"summary":{"advisory":1,"critical":0,"by_detector":{"doi_compliance":{"total":1,"advisory":1,"critical":0,"informational":0}},"informational":0},"endpoint":"/pith/2605.04970/integrity.json","findings":[{"note":"DOI in the printed bibliography is fragmented by whitespace or line breaks. A longer candidate (10.1016/j.aiopen.2023.08.012.Liu) was visible in the surrounding text but could not be confirmed against doi.org as printed.","detector":"doi_compliance","severity":"advisory","ref_index":9,"audited_at":"2026-05-19T13:59:30.419833Z","detected_doi":"10.1016/j.aiopen.2023.08.012.Liu","finding_type":"recoverable_identifier","verdict_class":"incontrovertible","detected_arxiv_id":null}],"available":true,"detectors_run":[{"name":"doi_title_agreement","ran_at":"2026-05-19T21:31:19.906526Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T13:59:30.419833Z","status":"completed","version":"1.0.0","findings_count":1}],"snapshot_sha256":"7400532a3959b0907a6422654af3b1c2d47062eb5610e3a4934846f7a7f46411"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}