IdioLink introduces a benchmark dataset and evaluation showing that strong embedding models struggle to retrieve equivalent meanings across idiomatic and literal forms, relying on shallow cues instead.
Linq-embed-mistral technical report.arXiv preprint arXiv:2412.03223, 2024
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 9roles
background 2polarities
background 2representative citing papers
LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,
GQR is a test-time optimization technique that refines primary retriever query embeddings using complementary retriever scores to achieve high performance with smaller representations in multimodal visual document retrieval.
Meta-study of MTEB rankings introduces dataset-composition and ranking-scheme robustness indicators and finds only a small subset of models stay consistently strong across tasks, languages, and evaluation variations.
Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.
KD-Judge structures fitness rules via LLM retrieval and chain-of-thought, then uses pose-guided kinematics for rule-based rep validation with caching for efficient edge deployment, achieving RTF < 1 and speedups up to 15.91x on Jetson.
Proposes High-Precision Scoring (HPS) and Tie-aware Retrieval Metrics (TRM) to reduce tie-induced instability in low-precision retrieval evaluation.
Fine-tuned recurrent models like Mamba2 produce competitive text embeddings with linear-time constant-memory inference via vertical chunking, outperforming transformers in memory use.
Coreference resolution improves retrieval relevance and QA performance in RAG systems, with mean pooling performing best and smaller models benefiting more.
citing papers explorer
-
Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization
GQR is a test-time optimization technique that refines primary retriever query embeddings using complementary retriever scores to achieve high performance with smaller representations in multimodal visual document retrieval.
-
Reliable Evaluation Protocol for Low-Precision Retrieval
Proposes High-Precision Scoring (HPS) and Tie-aware Retrieval Metrics (TRM) to reduce tie-induced instability in low-precision retrieval evaluation.
-
From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems
Coreference resolution improves retrieval relevance and QA performance in RAG systems, with mean pooling performing best and smaller models benefiting more.