arXiv preprint arXiv:2410.16163, year=
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
HKVLM: Faithful Reasoning Grounding by Binding Language Queries to a Frozen Detector
HKVLM trains only an alignment hook to bind frozen LM query embeddings to frozen detector proposals via contrastive retrieval and bipartite assignment, yielding 50-90x grounding gains and reduced hallucinations on RefCOCO and POPE.