{"paper":{"title":"Unlocking Compositional Generalization in Continual Few-Shot Learning","license":"http://creativecommons.org/licenses/by/4.0/","headline":"By optimizing slot representations for holistic class identity in training and composing them at inference, the framework achieves strong generalization to novel concepts with minimal forgetting.","cross_cats":["cs.CV"],"primary_cat":"cs.LG","authors_text":"Chi-Nguyen Tran, Dao Sy Duy Minh, Huynh Trung Kiet, Long Tran-Thanh, Phu-Hoa Pham, Phu-Quy Nguyen-Lam","submitted_at":"2026-05-12T08:02:31Z","abstract_excerpt":"Object-centric representations promise a key property for few-shot learning: Rather than treating a scene as a single unit, a model can decompose it into individual object-level parts that can be matched and compared across different concepts. In practice, this potential is rarely realized. Continual learners either collapse scenes into global embeddings, or train with part-level matching objectives that tie representations too closely to seen patterns, leaving them unable to generalize to truly novel concepts. In this paper, we identify this fundamental structural conflict and pioneer a new p"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"our framework employs a dual-phase strategy. During training, slot representations are optimized entirely toward holistic class identity, preserving highly generalizable, object-level geometries. At inference, preserved slots are dynamically composed to match novel scenes... 
achieving state-of-the-art unseen-concept generalization and minimal forgetting across standard continual learning benchmarks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Leveraging the inherent patch-level semantic geometry of self-supervised Vision Transformers (ViTs) that remains generalizable when optimized holistically for class identity rather than tying to seen patterns.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A dual-phase framework using self-supervised ViT slots optimizes representations for class identity during training and composes them dynamically at inference to achieve state-of-the-art generalization to unseen concepts with minimal forgetting in continual few-shot learning.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"By optimizing slot representations for holistic class identity in training and composing them at inference, the framework achieves strong generalization to novel concepts with minimal forgetting.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"a85abe34962b8a3fb7176727d582027800a793add4a83a5afc52d1a22a88efb0"},"source":{"id":"2605.11710","kind":"arxiv","version":2},"verdict":{"id":"4f7f5289-5661-40ec-ab3d-6ac8d15daeed","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-13T07:10:34.349602Z","strongest_claim":"our framework employs a dual-phase strategy. During training, slot representations are optimized entirely toward holistic class identity, preserving highly generalizable, object-level geometries. At inference, preserved slots are dynamically composed to match novel scenes... 
achieving state-of-the-art unseen-concept generalization and minimal forgetting across standard continual learning benchmarks.","one_line_summary":"A dual-phase framework using self-supervised ViT slots optimizes representations for class identity during training and composes them dynamically at inference to achieve state-of-the-art generalization to unseen concepts with minimal forgetting in continual few-shot learning.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Leveraging the inherent patch-level semantic geometry of self-supervised Vision Transformers (ViTs) that remains generalizable when optimized holistically for class identity rather than tying to seen patterns.","pith_extraction_headline":"By optimizing slot representations for holistic class identity in training and composing them at inference, the framework achieves strong generalization to novel concepts with minimal forgetting."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.11710/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-19T11:39:24.084377Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T09:31:17.241531Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T08:12:42.248301Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"9cfee07db25100eb28d5e86be7cf4109e705fff34a38ed29098139f855d7b02d"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}