{"total":11,"items":[{"citing_arxiv_id":"2606.00610","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-05-30T08:18:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MemGraphRAG uses a memory-based multi-agent system for globally consistent graph construction from fragmented corpora plus a memory-aware hierarchical retriever, claiming better benchmark performance than prior GraphRAG methods at similar cost.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07249","ref_index":41,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MLAIRE: Multilingual Language-Aware Information Retrieval Evaluation Protocal","primary_cat":"cs.IR","submitted_at":"2026-05-08T05:10:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MLAIRE is a protocol that evaluates multilingual retrievers on both semantic accuracy and query-language preference using parallel passages and new metrics like LPR and Lang-nDCG, showing that standard metrics hide distinct behavioral differences among retrievers.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"released publicly available multilingual retrievers at the time of our experiments. Dense: Our dense retrievers encompass diverse model lineages, ranging from widely adopted encoder-only families such as multilingual-e5 [32], bge-m3 [33], gte [34], snowflake-arctic [35], nomic-embed [36], embeddinggemma [37] and jina [38, 39], to recent LLM-based embedding models including Qwen3-Embedding [40], llama-nemotron [41], and pplx-embed [42]. 
In this paradigm, queries and passages are independently encoded into fixed-dimensional vectors and scored by cosine similarity. We use each model's prescribed pooling strategy (CLS, mean, or last-token) and follow their recommended instruction templates; representative prefix formats are listed in Appendix C. Sparse: We evaluate two multilingual sparse retrievers: a subword lexical baseline and a neu-"},{"citing_arxiv_id":"2605.00353","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Negative Data Mining for Contrastive Learning in Dense Retrieval at IKEA.com","primary_cat":"cs.IR","submitted_at":"2026-05-01T02:32:02+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Structured negative mining with taxonomy and LLM judges improves offline category accuracy by 2.6% in IKEA search but yields no significant online engagement gains due to prevalent zero-click user behavior.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.16576","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability","primary_cat":"cs.IR","submitted_at":"2026-04-17T13:02:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"tax\", exhibiting limited generalizability across broader retrieval conditions. 
RQ2: How stable are LLM-based dense retrievers under perturbations? We assess stability from two complementary angles: query-side variations and document-side adversarial attacks. For query-side stability, we consider five query variations (misspelling, reordering, synonymizing, paraphrasing, and naturalizing), following prior work [51], and measure the relative performance degradation under each type. For document-side stability, we evaluate resilience against corpus poisoning attacks [37, 76] under both white-box and direct-transfer black-box settings, where adversarial passages are injected to mislead the retriever. Our experiments reveal that while LLM-based retrievers demonstrate"},{"citing_arxiv_id":"2604.15484","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"vstash: Local-First Hybrid Retrieval with Adaptive Fusion for LLM Agents","primary_cat":"cs.IR","submitted_at":"2026-04-16T19:22:58+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"vstash shows that hybrid retrieval disagreements provide a free training signal to fine-tune 33M-parameter embeddings, yielding NDCG@10 gains up to 19.5% on NFCorpus and matching some larger models on three of five BEIR datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.11092","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ARHN: Answer-Centric Relabeling of Hard Negatives with Open-Source LLMs for Dense Retrieval","primary_cat":"cs.IR","submitted_at":"2026-04-13T07:11:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ARHN refines hard-negative training data for dense retrieval by using LLMs to convert answer-containing passages into additional positives and exclude answer-containing passages from the negative 
set.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.08649","ref_index":3,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"PRAGMA: Revolut Foundation Model","primary_cat":"cs.LG","submitted_at":"2026-04-09T18:00:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"PRAGMA pre-trains a Transformer on heterogeneous banking events with a tailored self-supervised masked objective, yielding embeddings that support strong downstream performance on credit scoring, fraud detection, and lifetime value prediction using linear heads or light fine-tuning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.20354","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"EmbeddingGemma: Powerful and Lightweight Text Representations","primary_cat":"cs.CL","submitted_at":"2025-09-24T17:56:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A 300M-parameter open embedding model sets new SOTA on MTEB for its size class and matches models twice as large while staying effective when compressed.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2508.01959","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension","primary_cat":"cs.CL","submitted_at":"2025-08-03T23:59:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SitEmb-v1.5 uses a new training paradigm to produce context-situated embeddings for short chunks, outperforming larger models by over 10% on a curated book-plot 
retrieval benchmark.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.08480","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Improving Korean-English Cross-Lingual Retrieval: A Data-Centric Study of Language Composition and Model Merging","primary_cat":"cs.IR","submitted_at":"2025-07-11T10:44:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Language composition in training data creates opposing effects on CLIR and mono-IR performance for Korean-English retrieval, which model merging can partially resolve.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2405.17428","ref_index":48,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models","primary_cat":"cs.CL","submitted_at":"2024-05-27T17:59:45+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}