Edward J

URL https://arxiv · 2025 · arXiv 2506.18902

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · novelty 7.0

ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.

Spectral Tempering for Embedding Compression in Dense Passage Retrieval

cs.IR · 2026-03-19 · unverdicted · novelty 7.0

Spectral Tempering derives an adaptive scaling factor γ(k) from the embedding eigenspectrum via local SNR analysis and knee-point normalization to achieve near-optimal compression without training or validation.

LMEB: Long-horizon Memory Embedding Benchmark

cs.CL · 2026-03-13 · unverdicted · novelty 7.0

LMEB benchmark shows that embedding models' performance on traditional retrieval does not transfer to long-horizon memory tasks, larger models do not always perform better, and LMEB measures capabilities orthogonal to MTEB.

Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization

cs.CL · 2025-10-06 · unverdicted · novelty 7.0

GQR is a test-time optimization technique that refines primary retriever query embeddings using complementary retriever scores to achieve high performance with smaller representations in multimodal visual document retrieval.

Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding

cs.SD · 2025-12-04 · unverdicted · novelty 6.0

AcuLa aligns audio models with medical language models via contrastive and self-supervised objectives on LLM-generated clinical reports, raising mean AUROC from 0.68 to 0.79 across 18 cardio-respiratory tasks.

Human-Inspired Context-Selective Multimodal Memory for Social Robots

cs.AI · 2026-04-13 · unverdicted · novelty 5.0

A new memory system for social robots selectively stores multimodal memories by emotional salience and novelty, achieving 0.506 Spearman correlation in selectivity and up to 13% better Recall@1 in multimodal retrieval.

Fall into a Pit, Gain in a Wit: Cognitive-Guided Harmful Meme Detection via Misjudgment Risk Pattern Retrieval

cs.LG · 2025-10-10 · unverdicted · novelty 5.0

PatMD improves harmful meme detection by retrieving misjudgment risk patterns to guide MLLMs, reporting 8.30% average F1 and 7.71% accuracy gains on 6,626 memes across 5 tasks.

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

cs.CL · 2026-01-08 · unverdicted · novelty 4.0

Qwen3-VL-Embedding-8B achieves state-of-the-art performance with a 77.8 overall score on the MMEB-V2 multimodal embedding benchmark.

citing papers explorer

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval
cs.CV · 2026-05-08 · unverdicted · none · ref 17
Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval
cs.CV · 2026-04-11 · unverdicted · none · ref 9
Spectral Tempering for Embedding Compression in Dense Passage Retrieval
cs.IR · 2026-03-19 · unverdicted · none · ref 7
LMEB: Long-horizon Memory Embedding Benchmark
cs.CL · 2026-03-13 · unverdicted · none · ref 14
Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization
cs.CL · 2025-10-06 · unverdicted · none · ref 11
Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding
cs.SD · 2025-12-04 · unverdicted · none · ref 29
Human-Inspired Context-Selective Multimodal Memory for Social Robots
cs.AI · 2026-04-13 · unverdicted · none · ref 11
Fall into a Pit, Gain in a Wit: Cognitive-Guided Harmful Meme Detection via Misjudgment Risk Pattern Retrieval
cs.LG · 2025-10-10 · unverdicted · none · ref 15
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking
cs.CL · 2026-01-08 · unverdicted · none · ref 8
