Histopathology Multi-modal Embedding for Pathology Composed Retrieval
To overcome the black-box nature of predictive AI and the hallucination risks of generative models, retrieval-based models offer an interpretable, evidence-based paradigm for clinical pathology workflows. However, real-world clinical queries are inherently interleaved, combining modalities such as pathology images and text. Current dual-encoders suffer from an Architectural Mismatch: they lack a mechanism to fuse such composed queries. To address this, we formalize the task of Pathology Composed Retrieval (PCR). While Multimodal Large Language Models (MLLMs) offer deep-fusion capabilities, applying them directly exposes a Task Mismatch and a Domain Mismatch. To resolve these challenges, we propose HOMIE, a model-agnostic adaptation framework that transforms any generative MLLM into a specialized pathology retrieval expert. Evaluated on our newly introduced PCR Benchmark, a lightweight 2B-parameter HOMIE variant substantially outperforms existing paradigms, surpassing specialized 7B pathology MLLMs and dual-encoders by large margins on composed retrieval while maintaining strong performance on traditional simple retrieval. The project page is available at https://qfchou.github.io/HOMIE_page/.
Forward citations
Cited by 2 Pith papers
- Mitigating Batch Effects in Histopathology via Language-Mediated Robust Embedding Generation
  GLMP generates robust pathology embeddings by routing histology images through an intermediate textual representation produced by general-purpose MLLMs to mitigate batch effects.
- PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow
  PathoSage is a three-stage framework using Structured Evidence Deliberation and a Beta-Bernoulli experience system to improve patch-level pathology reasoning by mitigating hallucinations and tool conflicts.