Multimodal fusion of MLLM-generated text embeddings and visual features improves retrieval for forensic tattoo and face matching tasks across images, descriptions, and sketches.
Face photo retrieval by sketch example, in: Proceedings of the 20th ACM International Conference on Multimedia, Association for Computing Machinery, New York, NY, USA
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Bridging the Modality Gap in Forensic Image Retrieval