Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection
RGSE adapts text embeddings at test time via evolutionary search, using cosine similarity rewards from high-confidence visual proposals to improve open-vocabulary object detection under distribution shifts.
- Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM
Chat-Scene++ improves 3D scene understanding in multimodal LLMs by representing scenes as context-rich object sequences with identifier tokens and grounded chain-of-thought reasoning, reaching state-of-the-art on five benchmarks using pre-trained encoders.