11 Hsin-Ling Hsu and Jengnan Tzeng

Michael Günther, Saba Sturua, Mohammad Kalim Akram, Isabelle Mohr, Andrei Ungureanu, Bo Wang, Sedigheh Eslami, Scott Martens, Maximilian Werk, Nan Wang, Han Xiao · 2025 · arXiv 2506.18902

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

read on arXiv browse 11 citing papers

citation-role summary

background 1 method 1

citation-polarity summary

background 1 use method 1

representative citing papers

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · novelty 7.0

ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.

Spectral Tempering for Embedding Compression in Dense Passage Retrieval

cs.IR · 2026-03-19 · unverdicted · novelty 7.0

Spectral Tempering derives an adaptive scaling factor γ(k) from the embedding eigenspectrum via local SNR analysis and knee-point normalization to achieve near-optimal compression without training or validation.

LMEB: Long-horizon Memory Embedding Benchmark

cs.CL · 2026-03-13 · unverdicted · novelty 7.0

LMEB benchmark shows that embedding models' performance on traditional retrieval does not transfer to long-horizon memory tasks, larger models do not always perform better, and LMEB measures capabilities orthogonal to MTEB.

Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization

cs.CL · 2025-10-06 · unverdicted · novelty 7.0

GQR is a test-time optimization technique that refines primary retriever query embeddings using complementary retriever scores to achieve high performance with smaller representations in multimodal visual document retrieval.

Your Embedding Model is SMARTer Than You Think

cs.IR · 2026-05-24 · unverdicted · novelty 6.0

SMART unlocks latent multi-vector capabilities in single-vector embedding models by applying late interaction to frozen hidden states shaped by contrastive training, yielding consistent gains on MMEB-V2 and visual document retrieval.

Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding

cs.SD · 2025-12-04 · unverdicted · novelty 6.0

AcuLa aligns audio models with medical language models via contrastive and self-supervised objectives on LLM-generated clinical reports, raising mean AUROC from 0.68 to 0.79 across 18 cardio-respiratory tasks.

DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark

cs.CV · 2026-05-28 · unverdicted · novelty 5.0

DocRetriever introduces a framework using layout-aware sparse embeddings for hybrid encoding without OCR and a generalizable reasoning-augmented reranker for few-shot settings, plus the MultiDocR benchmark for evaluation.

Human-Inspired Context-Selective Multimodal Memory for Social Robots

cs.AI · 2026-04-13 · unverdicted · novelty 5.0

A new memory system for social robots selectively stores multimodal memories by emotional salience and novelty, achieving 0.506 Spearman correlation in selectivity and up to 13% better Recall@1 in multimodal retrieval.

Fall into a Pit, Gain in a Wit: Cognitive-Guided Harmful Meme Detection via Misjudgment Risk Pattern Retrieval

cs.LG · 2025-10-10 · unverdicted · novelty 5.0

PatMD improves harmful meme detection by retrieving misjudgment risk patterns to guide MLLMs, reporting 8.30% average F1 and 7.71% accuracy gains on 6,626 memes across 5 tasks.

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

cs.CL · 2026-01-08 · unverdicted · novelty 4.0

Qwen3-VL-Embedding-8B achieves state-of-the-art performance with a 77.8 overall score on the MMEB-V2 multimodal embedding benchmark.

citing papers explorer

Showing 11 of 11 citing papers after filters.

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval cs.CV · 2026-05-08 · unverdicted · none · ref 17
A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.
Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval cs.CV · 2026-04-11 · unverdicted · none · ref 9
ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.
Spectral Tempering for Embedding Compression in Dense Passage Retrieval cs.IR · 2026-03-19 · unverdicted · none · ref 7
Spectral Tempering derives an adaptive scaling factor γ(k) from the embedding eigenspectrum via local SNR analysis and knee-point normalization to achieve near-optimal compression without training or validation.
LMEB: Long-horizon Memory Embedding Benchmark cs.CL · 2026-03-13 · unverdicted · none · ref 14
LMEB benchmark shows that embedding models' performance on traditional retrieval does not transfer to long-horizon memory tasks, larger models do not always perform better, and LMEB measures capabilities orthogonal to MTEB.
Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization cs.CL · 2025-10-06 · unverdicted · none · ref 11
GQR is a test-time optimization technique that refines primary retriever query embeddings using complementary retriever scores to achieve high performance with smaller representations in multimodal visual document retrieval.
Your Embedding Model is SMARTer Than You Think cs.IR · 2026-05-24 · unverdicted · none · ref 4
SMART unlocks latent multi-vector capabilities in single-vector embedding models by applying late interaction to frozen hidden states shaped by contrastive training, yielding consistent gains on MMEB-V2 and visual document retrieval.
Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding cs.SD · 2025-12-04 · unverdicted · none · ref 29
AcuLa aligns audio models with medical language models via contrastive and self-supervised objectives on LLM-generated clinical reports, raising mean AUROC from 0.68 to 0.79 across 18 cardio-respiratory tasks.
DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark cs.CV · 2026-05-28 · unverdicted · none · ref 23
DocRetriever introduces a framework using layout-aware sparse embeddings for hybrid encoding without OCR and a generalizable reasoning-augmented reranker for few-shot settings, plus the MultiDocR benchmark for evaluation.
Human-Inspired Context-Selective Multimodal Memory for Social Robots cs.AI · 2026-04-13 · unverdicted · none · ref 11
A new memory system for social robots selectively stores multimodal memories by emotional salience and novelty, achieving 0.506 Spearman correlation in selectivity and up to 13% better Recall@1 in multimodal retrieval.
Fall into a Pit, Gain in a Wit: Cognitive-Guided Harmful Meme Detection via Misjudgment Risk Pattern Retrieval cs.LG · 2025-10-10 · unverdicted · none · ref 15
PatMD improves harmful meme detection by retrieving misjudgment risk patterns to guide MLLMs, reporting 8.30% average F1 and 7.71% accuracy gains on 6,626 memes across 5 tasks.
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking cs.CL · 2026-01-08 · unverdicted · none · ref 8
Qwen3-VL-Embedding-8B achieves state-of-the-art performance with a 77.8 overall score on the MMEB-V2 multimodal embedding benchmark.

11 Hsin-Ling Hsu and Jengnan Tzeng

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer