Effective Prompt Pool Learning for Continual Category Discovery
Pith reviewed 2026-05-23 23:22 UTC · model grok-4.3
The pith
Prompt pools conditioned on Gaussian mixtures for global prototypes and on part-level pools for local regions enable label-free continual discovery of new categories from unlabeled data streams.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that representing prompt pools via a Gaussian mixture model over global embeddings, where each component serves as both prototype and conditioning prompt, combined with decomposition into part-level prompt pools for local regions, permits label-free prompt selection, automatic estimation of emerging category counts, and improved discovery accuracy on both generic and fine-grained benchmarks while reducing catastrophic forgetting.
What carries the argument
The Gaussian Mixture Prompt (GMP) module that fits a generative GMM to feature embeddings so each mixture component doubles as class prototype and dynamic prompt, together with the Part-level Prompting (PLP) modules that maintain separate specialized prompt pools for object parts and assign them dynamically to local regions.
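The dual role of a mixture component can be illustrated with a minimal sketch. It assumes a diagonal-covariance GMM has already been fitted, and it assumes the component means are used directly as prompt vectors. The names `component_log_probs`, `select_prompts` and `top_k` are hypothetical and not taken from the paper.

```python
import numpy as np

def component_log_probs(z, means, variances, weights):
    """Log of weight_k * N(z | mean_k, diag(variance_k)) for every component k."""
    d = z.shape[0]
    diff = z[None, :] - means                      # (K, d)
    log_det = np.sum(np.log(variances), axis=1)    # (K,)
    maha = np.sum(diff ** 2 / variances, axis=1)   # (K,)
    return np.log(weights) - 0.5 * (d * np.log(2 * np.pi) + log_det + maha)

def select_prompts(z, means, variances, weights, top_k=2):
    """Pick the top_k most responsible components for embedding z.

    No label is used: selection depends only on the posterior responsibilities.
    Returning the component means as prompts is an assumption of this sketch.
    """
    log_p = component_log_probs(z, means, variances, weights)
    resp = np.exp(log_p - np.max(log_p))
    resp /= resp.sum()                             # posterior responsibilities
    idx = np.argsort(-resp)[:top_k]
    return idx, means[idx], resp[idx]

# Toy mixture with three components in a 2-D embedding space.
means = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
variances = np.ones((3, 2))
weights = np.array([0.5, 0.3, 0.2])
idx, prompts, resp = select_prompts(np.array([4.5, 5.2]), means, variances, weights)
```

Here the query embedding lies next to the second component, so that component dominates the responsibilities and its mean is returned first.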
If this is right
- Category count rather than sample size is the main performance bottleneck, so finer part-level representations become necessary once category numbers grow.
- Label-free prompt selection and on-the-fly category count estimation become possible through the generative mixture model.
- Dynamic assignment of part-specific prompts to local regions improves discovery on fine-grained data without requiring manual part labels.
- The combined prompt-pool designs reduce catastrophic forgetting of previously discovered categories during the continual stream.
- The frameworks achieve better discovery performance than prior methods on both generic and fine-grained benchmarks.
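One way a generative mixture could yield a category count is model selection over the number of components. The sketch below uses the Bayesian information criterion with a small spherical-covariance EM loop written in NumPy. Both the criterion and the covariance form are assumptions for illustration; the summary above does not state which criterion the paper uses. All function names are hypothetical.

```python
import numpy as np

def log_joint(x, mu, var, pi):
    """log(pi_k * N(x_i | mu_k, var_k * I)) for all points i and components k."""
    d = x.shape[1]
    sq = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(-1)       # (n, K)
    return np.log(pi) - 0.5 * (d * np.log(2 * np.pi * var) + sq / var)

def fit_spherical_gmm(x, k, n_iter=50, var_floor=1e-2):
    """Plain EM with farthest-point initialisation; returns the log-likelihood."""
    n, d = x.shape
    centers = [x[0]]
    for _ in range(1, k):
        dist = np.min([((x - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(x[np.argmax(dist)])
    mu, var, pi = np.array(centers), np.full(k, x.var()), np.full(k, 1.0 / k)
    for _ in range(n_iter):
        lp = log_joint(x, mu, var, pi)
        resp = np.exp(lp - lp.max(1, keepdims=True))
        resp /= resp.sum(1, keepdims=True)                     # E-step
        nk = resp.sum(0) + 1e-12
        pi, mu = nk / n, resp.T @ x / nk[:, None]              # M-step
        sq = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        var = np.maximum((resp * sq).sum(0) / (nk * d), var_floor)
    lp = log_joint(x, mu, var, pi)
    m = lp.max(1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(lp - m).sum(1))).sum())

def estimate_num_categories(x, candidates):
    """Return the candidate component count with the lowest BIC."""
    n, d = x.shape
    bics = {}
    for k in candidates:
        n_params = k * d + k + (k - 1)     # means, variances, mixing weights
        bics[k] = -2.0 * fit_spherical_gmm(x, k) + n_params * np.log(n)
    return min(bics, key=bics.get), bics

# Toy data: three well-separated blobs standing in for three categories.
rng = np.random.default_rng(0)
blobs = [rng.normal(c, 0.5, size=(100, 2)) for c in ([0, 0], [8, 0], [0, 8])]
best_k, bics = estimate_num_categories(np.vstack(blobs), candidates=range(1, 6))
```

On real embeddings the blobs are rarely this clean, which is why the reliability of the estimate is treated as the load-bearing premise of the paper.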
Where Pith is reading between the lines
- The same mixture-based prompt construction could be tested on non-image modalities if their embeddings admit stable GMM fits.
- The finding that category count dominates sample size suggests experiments that deliberately vary the number of new classes while holding total samples fixed.
- Part-level prompt pools might generalize to other continual tasks that currently rely on global features alone.
- If the GMM fitting step proves sensitive to embedding quality, replacing the backbone or adding embedding regularization would be a direct next test.
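The experiment suggested in the second bullet needs splits in which the number of classes varies while the total sample budget stays constant. A minimal sketch of such a sampler follows; `fixed_budget_split` and the toy label stream are hypothetical, and the even per-class quota is one design choice among several.

```python
import random
from collections import defaultdict

def fixed_budget_split(labels, num_classes, total_samples, seed=0):
    """Sample `total_samples` indices spread evenly over `num_classes` classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    chosen_classes = rng.sample(sorted(by_class), num_classes)
    per_class, remainder = divmod(total_samples, num_classes)
    indices = []
    for rank, c in enumerate(chosen_classes):
        # The first `remainder` classes get one extra sample to hit the budget.
        quota = per_class + (1 if rank < remainder else 0)
        if quota > len(by_class[c]):
            raise ValueError(f"class {c} has fewer than {quota} samples")
        indices.extend(rng.sample(by_class[c], quota))
    return indices

# Toy stream: 20 classes with 100 samples each.
labels = [i % 20 for i in range(2000)]
splits = {k: fixed_budget_split(labels, k, total_samples=200) for k in (2, 5, 10, 20)}
```

Every split contains 200 samples, so any difference in discovery accuracy across the four settings can be attributed to class count rather than data volume.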
Load-bearing premise
That a Gaussian mixture model fitted to unlabeled feature embeddings will produce reliable class prototypes and the correct number of new categories that can be used as effective conditioning prompts for the backbone network.
What would settle it
A test stream with known ground-truth category counts where the number of mixture components selected by the GMP module differs substantially from the true number of new categories, or where adding the PLP modules produces no accuracy gain on a fine-grained discovery benchmark.
Original abstract
This paper studies effective prompt pool learning for Continual Category Discovery (CCD), a challenging open-world setting where a model must discover novel categories from a continuous stream of unlabelled data containing both known and novel classes, while mitigating catastrophic forgetting of previously learned concepts. We introduce a series of novel prompt-pool-based frameworks for CCD, each exploring a different design of prompt pools. First, we propose PromptCCD, which focuses on global class prototypes via a Gaussian Mixture Prompt (GMP) module. GMP fits a generative Gaussian mixture model over feature embeddings, where each mixture component serves as both a class prototype and a dynamic prompt that conditions the backbone's representations. This design enables label-free prompt selection and on-the-fly estimation of the number of emerging categories. Through a systematic spectrum study, we then show that category count, rather than sample size, is the primary bottleneck for discovery performance, motivating the need for finer-grained representations. Building on this finding, we propose PromptCCD++, which focuses on object-part prototypes via Part-level Prompting (PLP) modules. PLP decomposes the prompt pool into multiple, specialized part-level prompt pools. During the discovery phase, these pools dynamically assign part-specific prompts to local object regions without the need for manual part annotations, enabling the model to learn object-part representations that boost category discovery. Extensive evaluations on both generic and fine-grained benchmarks, supported by comprehensive ablation studies, demonstrate the effectiveness of our framework for CCD.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces PromptCCD and PromptCCD++ for Continual Category Discovery (CCD). PromptCCD uses a Gaussian Mixture Prompt (GMP) module that fits a GMM to backbone features so each component acts as both a class prototype and a dynamic conditioning prompt, enabling label-free selection and on-the-fly category-count estimation. PromptCCD++ adds Part-level Prompting (PLP) modules that decompose the prompt pool into specialized part-level pools for dynamic assignment to local regions. The authors report a spectrum study showing category count (not sample size) as the main bottleneck, plus extensive evaluations and ablations on generic and fine-grained benchmarks claiming improved discovery and forgetting mitigation.
Significance. If the empirical claims hold, the work offers a concrete prompt-pool design for open-world continual discovery that avoids manual part labels and uses generative modeling for prototype-based conditioning. The spectrum study on category count versus sample size is a useful diagnostic contribution. The approach builds directly on existing prompt learning without introducing new free parameters beyond standard GMM fitting.
major comments (2)
- [§3.2] §3.2 (GMP module): The central mechanism relies on the EM procedure both selecting the number of components and producing stable prototypes that correctly separate known from novel classes under non-stationary streams. No analysis is provided of component stability, merge/split behavior, or the model-selection criterion when feature distributions of known and novel classes overlap; this directly affects the label-free prompt selection and on-the-fly count estimation claims.
- [§4] §4 (experimental validation): The reported gains in discovery performance and forgetting mitigation rest on the assumption that the fitted GMM components remain reliable across tasks. The manuscript does not include diagnostics (e.g., component assignment accuracy or prototype drift metrics) that would confirm the GMP outputs are not simply fitting noise or merged modes; without such checks the ablation results cannot isolate the contribution of the proposed modules.
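The merge behavior raised in the first major comment can be probed with a pairwise overlap statistic between fitted components. The sketch below uses the Bhattacharyya distance for diagonal-covariance Gaussians and flags pairs that fall under a threshold. The threshold value, the diagonal-covariance assumption and the function names are illustrative choices, not part of the paper.

```python
import numpy as np

def bhattacharyya_diag(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two Gaussians with diagonal covariances."""
    var = 0.5 * (var1 + var2)
    term_mean = 0.125 * np.sum((mu1 - mu2) ** 2 / var)
    term_cov = 0.5 * np.sum(np.log(var) - 0.5 * (np.log(var1) + np.log(var2)))
    return float(term_mean + term_cov)

def merge_candidates(means, variances, threshold=0.5):
    """List component pairs whose distance is below `threshold` (arbitrary here)."""
    pairs = []
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            dist = bhattacharyya_diag(means[i], variances[i], means[j], variances[j])
            if dist < threshold:
                pairs.append((i, j, dist))
    return pairs

# Components 0 and 1 nearly coincide; component 2 is far from both.
means = np.array([[0.0, 0.0], [0.5, 0.0], [6.0, 6.0]])
variances = np.ones((3, 2))
pairs = merge_candidates(means, variances)
```

Tracking how many pairs are flagged after each stage of the stream would give one quantitative view of whether known and novel modes are being merged.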
minor comments (2)
- [§3.3] The description of how part-specific prompts are assigned to local regions in PLP (without manual annotations) would benefit from an explicit algorithmic step or pseudocode.
- [§3] Notation for the prompt pools and their conditioning on the backbone should be unified across GMP and PLP descriptions to avoid ambiguity in the equations.
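As an illustration of the kind of explicit step the first minor comment asks for, the sketch below shows one plausible annotation-free assignment rule. Each patch token is routed to the part pool holding its most similar key, and each pool then selects the prompt whose key best matches the pooled region feature. This is a hypothetical procedure built on cosine similarity, not the authors' algorithm, and every name and tensor shape is an assumption.

```python
import numpy as np

def l2_normalize(a, axis=-1, eps=1e-12):
    return a / (np.linalg.norm(a, axis=axis, keepdims=True) + eps)

def assign_part_prompts(patch_tokens, pool_keys, pool_prompts):
    """Route patches to part pools and pick one prompt per non-empty pool.

    patch_tokens: (N, d) local patch features.
    pool_keys:    (P, M, d) M keys for each of P part-level pools.
    pool_prompts: (P, M, L, d) one length-L prompt per key.
    """
    tokens = l2_normalize(patch_tokens)
    keys = l2_normalize(pool_keys)
    sim = np.einsum("nd,pmd->npm", tokens, keys)      # cosine similarities
    patch_to_pool = sim.max(axis=2).argmax(axis=1)    # best pool for each patch
    selected = {}
    for p in range(keys.shape[0]):
        mask = patch_to_pool == p
        if not mask.any():
            continue                                  # no region claimed this part
        region = l2_normalize(tokens[mask].mean(axis=0))
        best = int(np.argmax(keys[p] @ region))
        selected[p] = pool_prompts[p, best]           # (L, d) prompt tokens
    return patch_to_pool, selected

# Toy setup: 2 pools, 2 keys each, prompt length 3, feature dimension 4.
pool_keys = np.eye(4).reshape(2, 2, 4)
pool_prompts = np.arange(2 * 2 * 3 * 4, dtype=float).reshape(2, 2, 3, 4)
patches = np.array([[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.2, 1.0]])
patch_to_pool, selected = assign_part_prompts(patches, pool_keys, pool_prompts)
```

No part labels enter the computation; the grouping of patches into regions emerges from similarity to the learned keys alone.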
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the GMP module and experimental validation. We address each major comment below and will revise the manuscript to incorporate the suggested analyses.
-
Referee: [§3.2] §3.2 (GMP module): The central mechanism relies on the EM procedure both selecting the number of components and producing stable prototypes that correctly separate known from novel classes under non-stationary streams. No analysis is provided of component stability, merge/split behavior, or the model-selection criterion when feature distributions of known and novel classes overlap; this directly affects the label-free prompt selection and on-the-fly count estimation claims.
Authors: We agree that further analysis of component stability and behavior under distribution overlap would strengthen the claims. The manuscript currently emphasizes end-to-end discovery performance rather than internal GMM diagnostics. In revision we will add visualizations of component evolution across tasks, quantitative measures of merge/split events (e.g., via component overlap statistics), explicit specification of the model-selection criterion used in EM, and a targeted discussion plus controlled experiments on overlapping known/novel feature distributions to support the label-free selection mechanism.
revision: yes
-
Referee: [§4] §4 (experimental validation): The reported gains in discovery performance and forgetting mitigation rest on the assumption that the fitted GMM components remain reliable across tasks. The manuscript does not include diagnostics (e.g., component assignment accuracy or prototype drift metrics) that would confirm the GMP outputs are not simply fitting noise or merged modes; without such checks the ablation results cannot isolate the contribution of the proposed modules.
Authors: We concur that explicit reliability diagnostics would better isolate the GMP contribution from potential noise fitting. The current ablations focus on overall accuracy and forgetting metrics. In the revised manuscript we will include component assignment accuracy evaluated in controlled settings with available ground-truth labels, prototype drift metrics (e.g., average distance between successive prototypes), and additional checks confirming that components capture distinct modes rather than merged noise. These additions will be placed in §4 alongside the existing spectrum study and ablations.
revision: yes
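The two diagnostics named in the second response can be sketched in a few lines. The drift measure below averages the distance from each previous prototype to its nearest current prototype, and the assignment measure is majority-vote purity. Both are simplifying assumptions: category discovery work usually reports accuracy under an optimal one-to-one matching, and purity is inflated when there are more components than classes. The function names are hypothetical.

```python
import numpy as np

def prototype_drift(prev_means, curr_means):
    """Mean distance from each previous prototype to its nearest current one."""
    diff = prev_means[:, None, :] - curr_means[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)              # (K_prev, K_curr)
    nearest = dist.argmin(axis=1)
    return float(dist.min(axis=1).mean()), nearest

def assignment_purity(true_labels, components):
    """Fraction of samples whose component's majority class matches their label."""
    correct = 0
    for c in np.unique(components):
        members = true_labels[components == c]
        correct += np.bincount(members).max()
    return correct / len(true_labels)

# Two prototypes from the previous stage, three after the current stage.
prev_means = np.array([[0.0, 0.0], [4.0, 0.0]])
curr_means = np.array([[0.0, 1.0], [4.0, 0.0], [9.0, 9.0]])
drift, nearest = prototype_drift(prev_means, curr_means)

true_labels = np.array([0, 0, 0, 1, 1, 2])
components = np.array([0, 0, 1, 1, 1, 2])
purity = assignment_purity(true_labels, components)
```

A drift value that grows from stage to stage while purity on held-out labelled data falls would indicate that the prototypes are moving away from the classes they originally represented.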
Circularity Check
No circularity: method introduces independent design choices without reducing claims to fitted inputs or self-citations
The paper proposes PromptCCD and PromptCCD++ as new frameworks built on prompt pools, GMP (GMM fitting for prototypes/prompts), and PLP modules. These are presented as design innovations for CCD, with performance evaluated on benchmarks and ablations. No equations, derivations, or self-citation chains are shown that make any 'prediction' or result equivalent to its inputs by construction. The GMM fitting and prompt assignment are explicit modeling choices, not tautological. The spectrum study on category count is an empirical observation, not a forced outcome. This is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Gaussian mixture models fitted to feature embeddings can serve as reliable class prototypes and dynamic prompts for label-free conditioning and category count estimation.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: GMP fits a generative Gaussian mixture model over feature embeddings, where each mixture component serves as both a class prototype and a dynamic prompt
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, absolute_floor_iff_bare_distinguishability (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: on-the-fly estimation of the number of emerging categories
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.