Splate: Sparse late interaction retrieval

16 The Role of Vocabularies in Learning Sparse Representations for Ranking Thibault Formal, Stéphane Clinchant, Hervé Déjean, Carlos Lassance · 2021 · arXiv 2104.07186

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 method 1

citation-polarity summary

background 1 use method 1

representative citing papers

Unified and Efficient Approach for Multi-Vector Similarity Search

cs.DB · 2026-04-03 · unverdicted · novelty 7.0

MV-HNSW is the first native hierarchical graph index for multi-vector data, achieving over 90% recall with up to 14x lower search latency than prior filter-and-refine approaches across seven datasets.

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

cs.CL · 2024-02-05 · unverdicted · novelty 7.0

M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.

Are LLM-Based Retrievers Worth Their Cost? An Empirical Study of Efficiency, Robustness, and Reasoning Overhead

cs.IR · 2026-04-04 · accept · novelty 6.0

Empirical comparison across 14 retrievers on the BRIGHT benchmark shows reasoning-specialized models can match strong accuracy with competitive speed while many large LLM bi-encoders add latency for small gains and confidence scores remain poorly calibrated.

The Role of Vocabularies in Learning Sparse Representations for Ranking

cs.IR · 2025-09-20 · unverdicted · novelty 5.0

Larger 100K vocabularies in SPLADE models, especially those initialized with ESPLADE pretraining, improve retrieval effectiveness after pruning compared to 32K baselines while keeping similar efficiency.

citing papers explorer

Showing 4 of 4 citing papers.

Unified and Efficient Approach for Multi-Vector Similarity Search

cs.DB · 2026-04-03 · unverdicted · none · ref 7

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

cs.CL · 2024-02-05 · unverdicted · none · ref 39

Are LLM-Based Retrievers Worth Their Cost? An Empirical Study of Efficiency, Robustness, and Reasoning Overhead

cs.IR · 2026-04-04 · accept · none · ref 24

The Role of Vocabularies in Learning Sparse Representations for Ranking

cs.IR · 2025-09-20 · unverdicted · none · ref 7