You only look once: Unified, real-time object detection
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
- DSAA: Dual-Stage Attribute Activation for Fine-grained Open Vocabulary Detection
  DSAA improves fine-grained open-vocabulary object detection by injecting attribute priors via APA in text embeddings, modulating K/V vectors in BERT, and using an attribute-aware contrastive loss, with gains shown on the FG-OVD benchmark.
- SignReasoner: Compositional Reasoning for Complex Traffic Sign Understanding via Functional Structure Units
  SignReasoner decomposes traffic signs into functional structure units and uses a two-stage VLM post-training pipeline to achieve state-of-the-art compositional reasoning on a new benchmark.