pith. sign in

arxiv: 2605.29539 · v2 · pith:M3SXF6MEnew · submitted 2026-05-28 · 💻 cs.CV · cs.AI

GiPL: Generative augmented iterative Pseudo-Labeling for Cross-Domain Few-Shot Object Detection

classification 💻 cs.CV cs.AI
keywords giplsupportbranchcd-fsodcdiscovercross-domaindatadetection
0
0 comments X
read the original abstract

Vision-language foundation models have shown promising zero-shot generalization for Cross-Domain Few-Shot Object Detection (CD-FSOD). However, they face two critical challenges in fine-tuning: insufficient support set utilization due to sparse single-instance annotations, and severe overfitting under extremely limited target-domain samples. To address these issues, this paper proposes GiPL, an efficient two-branch training framework. In the first branch, we design an iterative pseudo-label self-training paradigm, which performs zero-shot inference on the support set to generate reliable pseudo-annotations, fuses them with ground-truth labels, and iteratively optimizes the model to fully exploit support set data. In the second branch, we introduce generative data augmentation pipeline using large vision-language models, which synthesizes domain-aligned, multi-object annotated images to enrich training samples and suppress overfitting. Extensive experiments on three challenging CD-FSOD datasets (RUOD, CARPK, CarDD) under 1/5/10-shot settings demonstrate that GiPL consistently outperforms state-of-the-art methods with significant performance gains. Code is available at \href{https://github.com/z-yaz/CDiscover}{CDiscover}.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.