LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
citing papers explorer
- On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability
LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,
- Spectral Tempering for Embedding Compression in Dense Passage Retrieval
Spectral Tempering derives an adaptive scaling factor γ(k) from the embedding eigenspectrum via local SNR analysis and knee-point normalization to achieve near-optimal compression without training or validation.
- Understanding Wacky Weights: A Dissection of SPLADE's Learned Term Importance
SPLADE models produce wacky expansion terms whose prevalence rises with larger vocabularies and falls with stricter sparsity; these terms primarily aid in-domain retrieval rather than out-of-domain generalization.
- Kernel Affine Hull Machines as Compute-Efficient Encoders for Frozen Semantic Spaces