
arxiv: 2502.07221 · v4 · pith:BCLF7BBEnew · submitted 2025-02-11 · 💻 cs.CV

Histopathology Multi-modal Embedding for Pathology Composed Retrieval

classification 💻 cs.CV
keywords pathology, retrieval, composed, HOMIE, mismatch, models, clinical

To overcome the black-box nature of predictive AI and the hallucination risks of generative models, retrieval-based models offer an interpretable, evidence-based paradigm for pathology clinical workflow. However, real-world clinical queries are inherently interleaved (e.g., pathology images and text). Current dual-encoders suffer from an Architectural Mismatch, lacking the mechanism to fuse such composed queries. To address this, we formalize the task of Pathology Composed Retrieval (PCR). While Multimodal Large Language Models (MLLMs) offer deep-fusion capabilities, directly applying them exposes a Task Mismatch and a Domain Mismatch. To resolve these challenges, we propose HOMIE, a model-agnostic adaptation framework that transforms any generative MLLM into a specialized pathology retrieval expert. Evaluated on our newly introduced PCR Benchmark, a lightweight 2B-parameter HOMIE variant substantially outperforms existing paradigms, surpassing specialized 7B pathology MLLMs and dual-encoders by large margins on composed retrieval, while maintaining strong performance on traditional simple retrieval. The project page is available at https://qfchou.github.io/HOMIE_page/.
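As a rough illustration of what a composed query means in retrieval terms, the toy sketch below ranks a small gallery by cosine similarity, first with an image-only query and then with a query that combines image and text. Everything in it is an assumption made for illustration: the three-dimensional vectors, the gallery ids, and the `late_fusion` averaging step are hypothetical stand-ins. The averaging step in particular is not HOMIE's method; the abstract describes deep fusion inside an MLLM, which this sketch does not attempt to reproduce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, gallery):
    """Return gallery ids sorted by descending cosine similarity to the query."""
    return sorted(gallery, key=lambda k: cosine(query_vec, gallery[k]), reverse=True)


def late_fusion(image_vec, text_vec):
    # Hypothetical placeholder for a fused query embedding: an element-wise
    # average of the two modality vectors. A real composed-retrieval model
    # would produce one embedding from the interleaved image and text input.
    return [(i + t) / 2 for i, t in zip(image_vec, text_vec)]


# Toy gallery: made-up embeddings, not real pathology data.
gallery = {
    "case_a": [1.0, 0.0, 0.0],  # matches the image content only
    "case_b": [0.0, 1.0, 0.0],  # matches the text content only
    "case_c": [0.7, 0.7, 0.0],  # matches both
}

image_vec = [1.0, 0.0, 0.0]
text_vec = [0.0, 1.0, 0.0]

image_only = rank(image_vec, gallery)
composed = rank(late_fusion(image_vec, text_vec), gallery)

print("image-only top hit:", image_only[0])
print("composed top hit:  ", composed[0])
```

In this toy setup the image-only query ranks the image-matching case first, while the combined query ranks the case that matches both modalities first, which is the behavior a composed-retrieval benchmark is meant to reward.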


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Mitigating Batch Effects in Histopathology via Language-Mediated Robust Embedding Generation

    cs.CV 2026-06 unverdicted novelty 5.0

    GLMP generates robust pathology embeddings by routing histology images through an intermediate textual representation produced by general-purpose MLLMs to mitigate batch effects.

  2. PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

    cs.AI 2026-05 unverdicted novelty 5.0

    PathoSage is a three-stage framework using Structured Evidence Deliberation and a Beta-Bernoulli experience system to improve patch-level pathology reasoning by mitigating hallucinations and tool conflicts.