Improving efficient neural ranking models with cross-architecture knowledge distillation. arXiv preprint arXiv:2010.02666

Sebastian Hofstätter, Sophia Althammer, Michael Schröder, Mete Sertkan, Allan Hanbury · 2021 · arXiv 2010.02666

10 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Prism-Reranker: Beyond Relevance Scoring -- Jointly Producing Contributions and Evidence for Agentic Retrieval

cs.IR · 2026-04-26 · accept · novelty 7.0

Prism-Reranker models output relevance, contribution statements, and evidence passages to support agentic retrieval beyond scalar scoring.

A Unified Model and Document Representation for On-Device Retrieval-Augmented Generation

cs.IR · 2026-04-15 · unverdicted · novelty 7.0

A single model unifies retrieval and context compression for on-device RAG via shared representations, matching traditional RAG performance at 1/10 context size with no extra storage.

Understanding Wacky Weights: A Dissection of SPLADE's Learned Term Importance

cs.IR · 2026-05-19 · conditional · novelty 6.0

SPLADE models produce wacky expansion terms whose prevalence rises with larger vocabularies and falls with stricter sparsity; these terms primarily aid in-domain retrieval rather than out-of-domain generalization.

LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations

cs.IR · 2025-09-16 · conditional · novelty 6.0

LEAF distills teacher-aligned student embedding models that achieve new SOTA results on BEIR and MTEB for their size class while requiring only modest data and compute.

Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages

cs.CL · 2026-05-28 · unverdicted · novelty 5.0

Fine-tuning a Spanish biomedical encoder on Gemini-generated synthetic data for multiple languages yields a bi-encoder that matches or exceeds BioBERT-ST on clinical code retrieval metrics, with further gains from cross-encoder reranking on most languages.

Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

cs.IR · 2026-04-06 · unverdicted · novelty 5.0

Stratified sampling preserving teacher score distribution outperforms hard-negative mining as a robust baseline for knowledge distillation in dense retrieval.

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

cs.CL · 2026-02-17 · unverdicted · novelty 5.0

A distillation-plus-task-contrastive training regimen yields compact embedding models that match or exceed state-of-the-art performance for their size while supporting 32k-token contexts and quantization.

The Role of Vocabularies in Learning Sparse Representations for Ranking

cs.IR · 2025-09-20 · unverdicted · novelty 5.0

Larger 100K vocabularies in SPLADE models, especially those initialized with ESPLADE pretraining, improve retrieval effectiveness after pruning compared to 32K baselines while keeping similar efficiency.

Search for Coverage: Learning Coverage-Aware Retrieval with Augmented Sub-Question Answerability

cs.IR · 2026-05-27 · unverdicted · novelty 4.0

CoveR improves nugget coverage by 10% over dense baselines in long-form RAG via coverage-aware contrastive training on LLM-generated sub-question signals without losing relevance performance.

Unified Supervision for Walmart's Sponsored Search Retrieval via Joint Semantic Relevance and Behavioral Engagement Modeling

cs.IR · 2026-04-09 · unverdicted · novelty 4.0

A hybrid supervision method for bi-encoder retrievers combines graded relevance from teacher models, production retrieval priors, and selective engagement to improve relevance and NDCG over Walmart's current sponsored search system.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Understanding Wacky Weights: A Dissection of SPLADE's Learned Term Importance

cs.IR · 2026-05-19 · conditional · none · ref 14

SPLADE models produce wacky expansion terms whose prevalence rises with larger vocabularies and falls with stricter sparsity; these terms primarily aid in-domain retrieval rather than out-of-domain generalization.

LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations

cs.IR · 2025-09-16 · conditional · none · ref 10

LEAF distills teacher-aligned student embedding models that achieve new SOTA results on BEIR and MTEB for their size class while requiring only modest data and compute.
